Data Scientist, Research Software Engineer, and teacher. Cassia is passionate about transformative processes in data, technology and life. She graduated in Philosophy and Information Systems and holds a Stricto Sensu Master's Degree in the field of Foundations of Mathematics.
K-Means Elbow Method and Silhouette Analysis with Yellowbrick and Scikit-Learn
K-Means is one of the most popular clustering algorithms. It assigns each cluster a central point and groups the other points based on their distance to that central point. A downside of K-Means is having to choose the number of clusters, K, prior to running the algorithm that groups points. If...
Definitive Guide to Hierarchical Clustering with Python and Scikit-Learn
In this guide, we will focus on implementing the Hierarchical Clustering Algorithm with Scikit-Learn to solve a marketing problem. After reading the guide, you will understand: when to apply Hierarchical Clustering; how to visualize the dataset to understand if it is fit for clustering; how to pre-process features and engineer...
Linear Regression in Python with Scikit-Learn
If you had studied longer, would your overall scores get any better? One way of answering this question is by having data on how long you studied for and what scores you got. We can then try to see if there is a pattern in that data, and if in...