Article
K-Means is one of the most popular clustering algorithms. It gives each cluster a central point and groups the other points based on their distance to that central point. A downside of K-Means is that the number of clusters, K, has to be chosen before running the algorithm. If...
Cássia Sampaio
In this guide, we will focus on implementing the Hierarchical Clustering Algorithm with Scikit-Learn to solve a marketing problem. After reading the guide, you will understand: when to apply Hierarchical Clustering; how to visualize the dataset to understand if it is fit for clustering; how to pre-process features and engineer...
TensorFlow Datasets, also known as tfds, is a library that serves as a wrapper to a wide selection of datasets, with its own functions to load, split and prepare datasets for Machine and Deep Learning, primarily with TensorFlow. Note: While the TensorFlow Datasets library is used to get data, it's...
David Landup
Keras is a high-level API, typically used with the TensorFlow library, that has lowered the barrier to entry for many and democratized the creation of Deep Learning models and systems. When just starting out, a high-level API that abstracts most of the inner workings helps people get the hang of the...
Scikit-Learn is one of the most widely used Machine Learning libraries in Python. It's optimized and efficient, and its high-level API is simple and easy to use. Scikit-Learn has a plethora of convenience tools and methods that make preprocessing, evaluating and other painstaking processes as easy as calling a single...
This guide is an in-depth introduction to an unsupervised dimensionality reduction technique called Random Projections. A Random Projection can be used to reduce the complexity and size of data, making the data easier to process and visualize. It is also a preprocessing technique for input preparation to a classifier or...
Mehreen Saeed
In this guide, we'll be taking a look at an unsupervised learning model, known as a Self-Organizing Map (SOM), as well as its implementation in Python. We'll be using an RGB Color example to train the SOM and demonstrate its performance and typical usage. Self-Organizing Maps: A General Introduction A...
In this guide, we'll dive into a dimensionality reduction, data embedding and data visualization technique known as Multidimensional Scaling (MDS). We'll be utilizing Scikit-Learn to perform Multidimensional Scaling, as it has a wonderfully simple and powerful API. Throughout the guide, we'll be using the Olivetti faces dataset from AT&...
This guide is an introduction to Spearman's rank correlation coefficient, its mathematical calculation, and its computation via Python's pandas library. We'll construct various examples to gain a basic understanding of this coefficient and demonstrate how to visualize the correlation matrix via heatmaps. What Is the Spearman Rank Correlation Coefficient? Spearman...
© 2013-2025 Stack Abuse. All rights reserved.