scikit-learn

Articles: 51

Recently published

Article

Guide to Multidimensional Scaling in Python with Scikit-Learn

In this guide, we'll dive into a dimensionality reduction, data embedding and data visualization technique known as Multidimensional Scaling (MDS). We'll be utilizing Scikit-Learn to perform Multidimensional Scaling, as it has a wonderfully simple and powerful API. Throughout the guide, we'll be using the Olivetti faces dataset from AT&...

Mehreen Saeed

Aug 24, 2021·16 min read

Article

Generating Synthetic Data with Numpy and Scikit-Learn

In this tutorial, we'll discuss the details of generating different synthetic datasets using the NumPy and Scikit-Learn libraries. We'll see how different samples can be generated from various distributions with known parameters. We'll also discuss generating datasets for different purposes, such as regression, classification, and clustering. At the end, we'll...

Mehreen Saeed

Oct 14, 2020·15 min read

Article

Kernel Density Estimation in Python Using Scikit-Learn

This article is an introduction to kernel density estimation using Python's machine learning library scikit-learn. Kernel density estimation (KDE) is a non-parametric method for estimating the probability density function of a given random variable. It is also referred to by its traditional name, the Parzen-Rosenblatt Window method, after its discoverers....

Mehreen Saeed

Sep 18, 2020·12 min read

Article

One-Hot Encoding in Python with Pandas and Scikit-Learn

In computer science, data can be represented in many different ways, and each representation has its advantages and disadvantages in certain fields. Since computers cannot process categorical data directly, as the categories have no inherent meaning for them, this information has to...

Mila Lukic

Apr 01, 2020·10 min read

Article

Grid Search Optimization Algorithm in Python

In this tutorial, we'll look at a powerful optimization (or automation) algorithm: Grid Search. It is most commonly used for hyperparameter tuning in machine learning models. We will learn how to implement it using Python, as well as apply it in an...

Muhammad Junaid Khalid

Mar 09, 2020·7 min read

Article

Ensemble/Voting Classification in Python with Scikit-Learn

Ensemble classification models can be powerful machine learning tools capable of achieving excellent performance and generalizing well to new, unseen datasets. The value of an ensemble classifier is that, in joining together the predictions of multiple classifiers, it can correct for errors made by any individual classifier, leading to better...

Dan Nelson

Jan 22, 2020·18 min read

Article

Dimensionality Reduction in Python with Scikit-Learn

In machine learning, the performance of a model only benefits from more features up until a certain point. The more features are fed into a model, the more the dimensionality of the data increases. As the dimensionality increases, overfitting becomes more likely. There are multiple techniques that can be used...

Dan Nelson

Nov 22, 2019·23 min read

Article

Gradient Boosting Classifiers in Python with Scikit-Learn

Gradient boosting classifiers are a group of machine learning algorithms that combine many weak learning models to create a strong predictive model. Decision trees are usually used as the weak learners. Gradient boosting models are becoming popular because of their effectiveness at classifying complex datasets, and have recently been...

Dan Nelson

Jul 17, 2019·22 min read

Article

Multiple Linear Regression with Python

Linear regression is one of the most commonly used algorithms in machine learning. You'll want to get familiar with linear regression because you'll need to use it if you're trying to measure the relationship between two or more continuous values. A deep dive into the theory and implementation of linear...

Dan Nelson

Jun 11, 2019·19 min read