Total 29 Posts

Generating Synthetic Data with Numpy and Scikit-Learn

Introduction

In this tutorial, we'll discuss the details of generating different synthetic datasets using the NumPy and Scikit-Learn libraries. We'll see how samples can be generated from various distributions with known parameters.

We'll also discuss generating datasets for different purposes, such as regression, classification, and clustering. At the end, we'll...
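As a quick taste of what the post covers, here's a minimal sketch of both ideas: drawing samples from a distribution with known parameters in NumPy, and generating a labeled dataset with scikit-learn (all parameter values here are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification

# Draw 1,000 samples from a normal distribution with a known
# mean (loc) and standard deviation (scale)
rng = np.random.default_rng(seed=0)
normal_samples = rng.normal(loc=5.0, scale=2.0, size=1000)

# Generate a small labeled dataset suitable for classification
X, y = make_classification(n_samples=200, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0)
```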

Kernel Density Estimation in Python Using Scikit-Learn

Introduction

This article is an introduction to kernel density estimation using Python's machine learning library scikit-learn.

Kernel density estimation (KDE) is a non-parametric method for estimating the probability density function of a given random variable. It is also referred to by its traditional name, the Parzen-Rosenblatt Window method, after its...
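In scikit-learn, KDE is provided by `sklearn.neighbors.KernelDensity`. A minimal sketch (the bandwidth and sample size are illustrative choices):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Sample data drawn from a standard normal distribution
rng = np.random.default_rng(seed=0)
data = rng.normal(loc=0.0, scale=1.0, size=(500, 1))

# Fit a Gaussian-kernel density estimate to the samples
kde = KernelDensity(kernel='gaussian', bandwidth=0.5).fit(data)

# score_samples returns log-densities; exponentiate to get densities
grid = np.linspace(-3, 3, 61).reshape(-1, 1)
density = np.exp(kde.score_samples(grid))
```

As expected for data centered at zero, the estimated density peaks near the middle of the grid and falls off toward the edges.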

One-Hot Encoding in Python with Pandas and Scikit-Learn

Introduction

In computer science, data can be represented in a lot of different ways, and naturally, every single one of them has its advantages as well as disadvantages in certain fields.

Since computers are unable to process categorical data directly, as these categories hold no inherent meaning for them, this information has...
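Both libraries named in the title can perform the encoding. A minimal sketch with an illustrative toy column:

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({'color': ['red', 'green', 'blue', 'green']})

# pandas: one indicator column per category
dummies = pd.get_dummies(df['color'])

# scikit-learn: OneHotEncoder returns a sparse matrix by default,
# so convert it to a dense array for inspection
encoder = OneHotEncoder()
encoded = encoder.fit_transform(df[['color']]).toarray()
```

Each row of the result contains exactly one 1, in the column of its category.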

Grid Search Optimization Algorithm in Python

Introduction

In this tutorial, we are going to talk about a very powerful optimization (or automation) technique: the grid search algorithm. It is most commonly used for hyperparameter tuning in machine learning models. We will learn how to implement it in Python, as well as apply it in...
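scikit-learn ships grid search as `GridSearchCV`, which exhaustively evaluates every combination in a parameter grid using cross-validation. A minimal sketch on the built-in iris dataset (the grid values are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Every combination of these values is evaluated with 5-fold CV
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

best = search.best_params_  # the best-scoring combination
```

The cost grows multiplicatively with the grid: here, 3 values of `C` times 2 kernels means 6 candidate models, each fit 5 times.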

Ensemble/Voting Classification in Python with Scikit-Learn

Introduction

Ensemble classification models can be powerful machine learning tools capable of achieving excellent performance and generalizing well to new, unseen datasets.

The value of an ensemble classifier is that, in joining together the predictions of multiple classifiers, it can correct for errors made by any individual classifier, leading to...
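scikit-learn's `VotingClassifier` captures this idea directly: several base models vote, and the majority decides. A minimal sketch (the base estimators and dataset are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hard voting: each classifier casts one vote, majority wins
ensemble = VotingClassifier(estimators=[
    ('lr', LogisticRegression(max_iter=1000)),
    ('dt', DecisionTreeClassifier(random_state=0)),
    ('knn', KNeighborsClassifier()),
], voting='hard')

scores = cross_val_score(ensemble, X, y, cv=5)
```

Switching to `voting='soft'` averages predicted probabilities instead of counting votes, which often helps when the base models are well calibrated.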

Dimensionality Reduction in Python with Scikit-Learn

Introduction

In machine learning, a model's performance benefits from additional features only up to a certain point. The more features are fed into a model, the higher the dimensionality of the data becomes, and as dimensionality increases, overfitting grows more likely.

There are multiple techniques that can be...
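One of the most common such techniques is principal component analysis (PCA); a minimal sketch on scikit-learn's built-in digits dataset (the choice of 10 components is illustrative):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# The digits dataset has 64 features (one per pixel)
X, _ = load_digits(return_X_y=True)

# Project onto the 10 directions of highest variance
pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X)

# Fraction of the original variance the 10 components retain
retained = pca.explained_variance_ratio_.sum()
```

A handful of components can retain a large share of the variance while shrinking the feature space dramatically.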

Gradient Boosting Classifiers in Python with Scikit-Learn

Introduction

Gradient boosting classifiers are a group of machine learning algorithms that combine many weak learning models to create a strong predictive model, usually with decision trees as the weak learners. Gradient boosting models are becoming popular because of their effectiveness at classifying complex datasets, and have recently...
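A minimal sketch of the scikit-learn implementation, `GradientBoostingClassifier`, on an illustrative synthetic dataset (hyperparameter values shown are the library defaults, spelled out for clarity):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# A synthetic binary classification problem
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 100 shallow trees corrects the errors of the ones before it
gbc = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 max_depth=3, random_state=0)
gbc.fit(X_train, y_train)
accuracy = gbc.score(X_test, y_test)
```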

Multiple Linear Regression with Python

Introduction

Linear regression is one of the most commonly used algorithms in machine learning. It's worth getting familiar with, since you'll need it whenever you want to measure the relationship between two or more continuous values.

A deep dive into the theory and implementation of...
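The core idea can be sketched in a few lines with scikit-learn's `LinearRegression`; the data here is illustrative, generated without noise from known coefficients so the model recovers them exactly:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Noiseless data generated from y = 3*x1 + 2*x2 + 1
rng = np.random.default_rng(seed=0)
X = rng.uniform(size=(100, 2))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 1.0

# With two predictors this is multiple (rather than simple) regression
model = LinearRegression().fit(X, y)
```

After fitting, `model.coef_` holds the slope for each predictor and `model.intercept_` the constant term.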