Grid Search Optimization Algorithm in Python

Introduction

In this tutorial, we are going to talk about a very powerful optimization (or automation) algorithm: Grid Search. It is most commonly used for hyperparameter tuning in machine learning models. We will learn how to implement it using Python, and then apply it in an actual application to see how it can help us choose the best parameters for our model and improve its accuracy. So let's start.

Prerequisites

To follow this tutorial, you should have a basic understanding of Python or some other programming language. It is preferred, but not essential, that you have some basic knowledge of machine learning as well. Other than that, this article is beginner-friendly and can be followed by anyone.

Installation

To go through the tutorial, you need to have the following libraries/frameworks installed in your system:

  1. Python 3
  2. NumPy
  3. Pandas
  4. Keras
  5. Scikit-Learn

They are all quite simple to install, and each project's website provides detailed installation instructions. Generally, the packages can be installed using pip:

$ pip install numpy pandas tensorflow keras scikit-learn

If you run into any issues, please refer to the official documentation for each package.

What is Grid Search?

Grid search is essentially an optimization algorithm which lets you select the best parameters for your optimization problem from a list of parameter options that you provide, hence automating the 'trial-and-error' method. Although it can be applied to many optimization problems, it is best known for its use in machine learning, where it finds the parameters at which the model gives the best accuracy.

Let's assume that your model takes the below three parameters as input:

  1. Number of hidden layers [2, 4]
  2. Number of neurons in each layer [5, 10]
  3. Number of epochs [10, 50]

If for each parameter input we wish to try out two options (as mentioned in square brackets above), it totals up to 2^3 = 8 different combinations (for example, one possible combination is [2, 5, 10]). Doing this manually would be a headache.
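As a small illustration, the eight combinations can be listed with itertools.product from the Python standard library. This is only a sketch; the variable names are chosen here for illustration and are not part of any model.

```python
from itertools import product

# Candidate values for each parameter, taken from the list above
hidden_layers = [2, 4]
neurons_per_layer = [5, 10]
epochs = [10, 50]

# Cartesian product: every way of picking one value per parameter
combinations = list(product(hidden_layers, neurons_per_layer, epochs))

for combo in combinations:
    print(combo)

print(len(combinations))  # 2 * 2 * 2 = 8
```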

Now imagine if we had 10 different input parameters and we wanted to try out 5 possible values for each parameter. It would require manual input from our side each time we wish to change a parameter value, rerun the code, and keep track of the results for all the combinations of parameters. Grid Search automates that process, as it simply takes the possible values for each parameter and runs the code to try out all possible combinations, outputs the result for each combination, as well as outputs the combination which gives the best accuracy. Useful, no?
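The bookkeeping described above can be sketched in a few lines of plain Python. Note that grid_search and toy_score are hypothetical names used only for illustration, and toy_score is a made-up stand-in for "train the model and measure its accuracy"; this is not how any particular library implements the search.

```python
from itertools import product

def grid_search(score_fn, param_grid):
    """Try every combination in param_grid; return (best_params, best_score, results)."""
    names = list(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        # Record the score of this combination so nothing has to be tracked by hand
        results.append((params, score_fn(**params)))
    best_params, best_score = max(results, key=lambda item: item[1])
    return best_params, best_score, results

def toy_score(x, y):
    # Made-up objective (assumption): highest at x = 3, y = -1
    return -((x - 3) ** 2) - (y + 1) ** 2

param_grid = {"x": [1, 2, 3, 4], "y": [-2, -1, 0]}
best_params, best_score, results = grid_search(toy_score, param_grid)

print(best_params, best_score)
print(len(results))  # 4 * 3 = 12 combinations tried

# Size of the grid for 10 parameters with 5 candidate values each
print(5 ** 10)
```

The last line shows why automation matters: 10 parameters with 5 values each already give 5^10 = 9,765,625 combinations, so the grid grows very quickly with every parameter you add.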

Grid Search Implementation

Alright, enough talk. Let's apply Grid Search to an actual application. Discussing the Machine Learning and Data Preprocessing part is out of scope for this tutorial, so we will simply run that code and talk in depth about the part where Grid Search comes in. Let's start!

We will be using the Pima Indian Diabetes dataset, which contains information about whether or not a patient is diabetic based on different attributes such as blood glucose concentration, blood pressure, etc. Using the Pandas read_csv() method, you can directly import the dataset from an online resource.

The following script imports the required libraries:

from sklearn.model_selection import GridSearchCV, KFold
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.wrappers.scikit_learn import KerasClassifier
from keras.optimizers import Adam
import sys
import pandas as pd
import numpy as np

The following script imports the dataset and sets the column headers for the dataset.

columns = ['num_pregnant', 'glucose_concentration', 'blood_pressure', 'skin_thickness',
           'serum_insulin', 'BMI', 'pedigree_function', 'age', 'class']

data_path = "https://raw.githubusercontent.com/mkhalid1/Machine-Learning-Projects-Python-/master/Grid%20Search/pima-indians-diabetes.csv"

df = pd.read_csv(data_path, names=columns)

Let's take a look at the first 5 rows of the dataset:

df.head()

Output:

[Image: the first five rows of the diabetes dataset]

As you can see, these 5 rows are all labels to describe each column (there are actually 9 of them), so they have no use to us. We'll start by removing these non-data rows. Then we'll treat 0 as a marker for a missing value: we replace every 0 with NaN and drop all rows that contain a NaN:

# Remove first 9 non-data rows (copy to avoid modifying a view of the original)
df = df.iloc[9:].copy()

# Treat 0 as a missing value: replace it with NaN (Not a Number) in each column
for col in columns:
    df[col] = df[col].replace(0, np.nan)

df.dropna(inplace=True) # Drop all rows with missing values
dataset = df.values # Convert dataframe to numpy array

The following script divides the data into the feature and label sets and applies the standard scaling on the dataset:

X = dataset[:,0:8]
Y = dataset[:, 8].astype(int)

# Normalize the data using sklearn StandardScaler
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler().fit(X)

# Transform the features so each column has zero mean and unit variance
X_standardized = scaler.transform(X)

data = pd.DataFrame(X_standardized)
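To make the scaling step less of a black box: StandardScaler subtracts each column's mean and divides by its (population) standard deviation. The sketch below reproduces that arithmetic in plain Python on a made-up two-column matrix. The standardize helper is a hypothetical name for illustration, and unlike the real scaler it assumes no column is constant.

```python
from statistics import fmean, pstdev

def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # assumes no column is constant
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Made-up example values, not taken from the diabetes dataset
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = standardize(rows)

for row in scaled:
    print(row)
```

Both columns end up with identical scaled values even though the second is ten times larger, which is the point of standardizing before training.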

The following function creates our simple deep learning model:

def create_model(learn_rate, dropout_rate):
    # Create model
    model = Sequential()
    model.add(Dense(8, input_dim=8, kernel_initializer='normal', activation='relu'))
    model.add(Dropout(dropout_rate))
    model.add(Dense(4, kernel_initializer='normal', activation='relu'))
    model.add(Dropout(dropout_rate))
    model.add(Dense(1, activation='sigmoid'))

    # Compile the model
    adam = Adam(lr=learn_rate)
    model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    return model

This is all the code that you need to run in order to load the dataset, preprocess it, and create your machine learning model. Since we are only interested in seeing the functionality of Grid Search, I have not performed the train/test split, and we'll be fitting the model on the entire dataset.

In the next section we'll start to see how Grid Search makes our life easier by optimizing our parameters.

Training the Model without Grid Search

In the code below, we will create a model using parameter values that we decided on randomly, or based on our intuition, and see how our model performs:

# Declare parameter values
dropout_rate = 0.1
epochs = 1
batch_size = 20
learn_rate = 0.001

# Create the model object by calling the create_model function we created above
model = create_model(learn_rate, dropout_rate)

# Fit the model onto the training data
model.fit(X_standardized, Y, batch_size=batch_size, epochs=epochs, verbose=1)

Output:

Epoch 1/1
130/130 [==============================] - 0s 2ms/step - loss: 0.6934 - accuracy: 0.6000

The accuracy we got, as you can see above, is 60.00%. This is quite low, but nothing to worry about! We still have Grid Search to try and save the day. So, let's get to it.

Optimizing Hyper-parameters using Grid Search

If you do not use Grid Search, you can directly call the fit() method on the model we have created above. However, to use Grid Search, we need to pass in some parameters to our create_model() function. Furthermore, we need to declare our grid with different options that we would like to try for each parameter. Let's do that in parts.

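First, we make sure our create_model() function accepts the parameters we want to tune (learn_rate and dropout_rate) as arguments, and then we wrap it in a KerasClassifier so that Scikit-Learn can treat it as an estimator: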

def create_model(learn_rate, dropout_rate):
    # Create model
    model = Sequential()
    model.add(Dense(8, input_dim=8, kernel_initializer='normal', activation='relu'))
    model.add(Dropout(dropout_rate))
    model.add(Dense(4, kernel_initializer='normal', activation='relu'))
    model.add(Dropout(dropout_rate))
    model.add(Dense(1, activation='sigmoid'))

    # Compile the model
    adam = Adam(lr=learn_rate)
    model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    return model

# Create the model
model = KerasClassifier(build_fn=create_model, verbose=1)

Now, we are ready to implement our Grid Search algorithm and fit it on the dataset:

# Define the parameters that you wish to use in your Grid Search along
# with the list of values that you wish to try out
learn_rate = [0.001, 0.02, 0.2]
dropout_rate = [0.0, 0.2, 0.4]
batch_size = [10, 20, 30]
epochs = [1, 5, 10]

seed = 42

# Make a dictionary of the grid search parameters
param_grid = dict(learn_rate=learn_rate, dropout_rate=dropout_rate,
                  batch_size=batch_size, epochs=epochs)

# Build and fit the GridSearchCV
# (shuffle=True is required for random_state to have an effect in KFold)
grid = GridSearchCV(estimator=model, param_grid=param_grid,
                    cv=KFold(shuffle=True, random_state=seed), verbose=10)

grid_results = grid.fit(X_standardized, Y)

# Summarize the results in a readable format
print("Best: {0}, using {1}".format(grid_results.best_score_, grid_results.best_params_))

means = grid_results.cv_results_['mean_test_score']
stds = grid_results.cv_results_['std_test_score']
params = grid_results.cv_results_['params']

for mean, stdev, param in zip(means, stds, params):
    print('{0} ({1}) with: {2}'.format(mean, stdev, param))

Output:

Best: 0.7959183612648322, using {'batch_size': 10, 'dropout_rate': 0.2, 'epochs': 10, 'learn_rate': 0.02}

In the output, we can see the parameter combination that yields the best mean cross-validated accuracy, which is about 79.6% here.
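To get a feel for how much work this search does, the sketch below counts the model fits for the grid used above. It assumes 5 cross-validation splits (the KFold default in recent Scikit-Learn releases) and that GridSearchCV refits the best combination once on the full data at the end (its default behavior); adjust n_splits if your setup differs.

```python
from math import prod

# Same candidate values as the grid defined above
param_grid = {
    "learn_rate": [0.001, 0.02, 0.2],
    "dropout_rate": [0.0, 0.2, 0.4],
    "batch_size": [10, 20, 30],
    "epochs": [1, 5, 10],
}
n_splits = 5  # assumption: default number of KFold splits

n_combinations = prod(len(values) for values in param_grid.values())
n_cv_fits = n_combinations * n_splits

print(n_combinations)  # 3 * 3 * 3 * 3 = 81
print(n_cv_fits)       # 81 * 5 = 405
print(n_cv_fits + 1)   # 406, including the final refit on all the data
```

Every extra value in any list multiplies the total, so it pays to keep each list short and informed by earlier runs.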

It is safe to say that Grid Search was quite easy to implement in Python and saved us a lot of time in terms of human labor. You can just list all the parameters you'd like to tune, declare the values to be tested, run your code, and forget about it. No more input is required from your side. Once the best parameter combination has been found, you can simply use it for your final model.

Conclusion

To sum it up, we learned what Grid Search is, how it can help us optimize our model, and the benefits it brings, such as automation. Furthermore, we learned how to implement it in a few lines of Python code. To see its effectiveness, we also trained a Machine Learning model with and without performing Grid Search, and the accuracy was roughly 19.6 percentage points higher with Grid Search (about 79.6% versus 60.00%).