How to Save and Load Fit Scikit-Learn Scalers

Scikit-Learn's scalers are a key preprocessing step for many regressors and classifiers, rescaling the features to a workable range before the model learns from them.

If you'd like to read more about feature scaling, read our "Feature Scaling Data with Scikit-Learn for Machine Learning in Python"!

When you want to push your model to production, the incoming data has to be scaled in the same way it was scaled during training for your model to work. A fresh scaler that wasn't fit on your training data generally won't reproduce the same scaled values!
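
To make that concrete, here's a minimal sketch using small synthetic arrays (the `X_train` and `X_prod` values are made up for illustration, and `MinMaxScaler` is just one possible choice of scaler). It compares a scaler fit on the training data with a fresh scaler fit on the production data:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-ins for training and production data (assumed for illustration)
X_train = np.array([[0.0], [5.0], [10.0]])
X_prod = np.array([[2.0], [4.0], [6.0]])

# Scaler fit on the training data: learns min=0, max=10
train_scaler = MinMaxScaler().fit(X_train)
# Fresh scaler fit on the production data: learns min=2, max=6
fresh_scaler = MinMaxScaler().fit(X_prod)

# Scale the same production data with both scalers
train_scaled = train_scaler.transform(X_prod)
fresh_scaled = fresh_scaler.transform(X_prod)

print('Training-fit scaler:', train_scaled.ravel())
print('Fresh scaler:', fresh_scaled.ravel())
```

The training-fit scaler maps the production values to roughly 0.2, 0.4 and 0.6, while the fresh scaler stretches them across the full 0 to 1 range, so a model trained on the former would see very different inputs from the latter.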

Thankfully, it's easy to save an already fit scaler and load it in a different environment alongside the model, to scale the data in the same way as during training:

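import joblib
from sklearn.preprocessing import StandardScaler

# X_train is assumed to be your training feature matrix
scaler = StandardScaler().fit(X_train)
# 'scaler.joblib' is only an example filename
joblib.dump(scaler, 'scaler.joblib')
scaler = joblib.load('scaler.joblib')

Putting it into practice: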

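import joblib
from sklearn.preprocessing import MinMaxScaler

# Fit on the training data before transforming or saving
scaler = MinMaxScaler().fit(X_train)
print('Scaler results:', scaler.transform(X_train)[:1])

# 'scaler.joblib' is only an example filename
joblib.dump(scaler, 'scaler.joblib')
scaler = joblib.load('scaler.joblib')
print('Loaded scaler results:', scaler.transform(X_train)[:1])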

This results in:

Scaler results: [[0.16060468 0.52941176 0.02742132 0.02532079 0.02561875 0.00184402
  0.4293305  0.47310757]]
Loaded scaler results: [[0.16060468 0.52941176 0.02742132 0.02532079 0.02561875 0.00184402
  0.4293305  0.47310757]]

The data was scaled in the exact same way across both scaler objects!
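
If you'd like to try the round trip without a dataset at hand, here's a self-contained sketch. It assumes randomly generated training data and a hypothetical filename inside a temporary directory, and checks both the fitted parameters and the transformed output after loading:

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Synthetic training data, assumed for illustration only
rng = np.random.default_rng(42)
X_train = rng.normal(size=(100, 8))

# Fit the scaler and record its output before saving
scaler = MinMaxScaler().fit(X_train)
before = scaler.transform(X_train)

# Save into a temporary directory; the filename is hypothetical
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'scaler.joblib')
    joblib.dump(scaler, path)
    loaded_scaler = joblib.load(path)

after = loaded_scaler.transform(X_train)

# The fitted parameters and the transformed output should match
same_params = (np.array_equal(scaler.data_min_, loaded_scaler.data_min_)
               and np.array_equal(scaler.data_max_, loaded_scaler.data_max_))
same_output = np.allclose(before, after)
print('Same fitted parameters:', same_params)
print('Same output:', same_output)
```

Keep in mind that Scikit-Learn doesn't guarantee that a saved estimator will load correctly under a different library version, so it's worth keeping the versions consistent between the training and production environments.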

Last Updated: August 22nd, 2022
