How to Save and Load Fit Scikit-Learn Scalers

Scikit-Learn's scalers are a common preprocessing step for regressors and classifiers, bringing the features to a workable, comparable range before a model learns from them.

If you'd like to read more about feature scaling, read our "Feature Scaling Data with Scikit-Learn for Machine Learning in Python"!

When you push your model to production, incoming data has to be scaled in the same way it was scaled during training for your model to work. A fresh scaler that wasn't fit on your training data learns different statistics (such as the minimum, maximum, mean or standard deviation), so it generally won't reproduce the values your model was trained on!
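To make this concrete, here is a minimal sketch using synthetic data. The arrays `X_train` and `X_new` are made-up stand-ins (not from the original article) for training data and data seen later in production:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-ins for training data and data seen later (assumed for illustration)
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 100, size=(200, 3))
X_new = rng.uniform(40, 60, size=(20, 3))

# Scaler fit during training: learns the per-feature min and max of X_train
train_scaler = MinMaxScaler().fit(X_train)

# A fresh scaler fit on the new data learns a different min and max
fresh_scaler = MinMaxScaler().fit(X_new)

# Compare how the two scalers transform the same rows
same = np.allclose(train_scaler.transform(X_new), fresh_scaler.transform(X_new))
print('Same scaled values:', same)
```

Because the two scalers learned different ranges, the comparison should come out `False`: the same rows are mapped to different values, which is why the scaler fit during training needs to be reused.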

Thankfully, it's easy to save an already fit scaler and load it in a different environment alongside the model, to scale the data in the same way as during training:

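import joblib
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
# Fit the scaler on your training data here, e.g. scaler.fit(X_train)
joblib.dump(scaler, 'scaler.save')   # Save the scaler to disk
scaler = joblib.load('scaler.save')  # Load it back, e.g. in another environment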

Putting it into practice:

import joblib
from sklearn.preprocessing import MinMaxScaler

# X_train is assumed to be your training feature matrix
scaler = MinMaxScaler()
scaler.fit(X_train)
print('Scaler results:', scaler.transform(X_train)[:1])

# Save the fitted scaler, then load it back
joblib.dump(scaler, 'scaler.save')
scaler = joblib.load('scaler.save')
print('Loaded scaler results:', scaler.transform(X_train)[:1])

This results in:

Scaler results: [[0.16060468 0.52941176 0.02742132 0.02532079 0.02561875 0.00184402
  0.4293305  0.47310757]]
  
Loaded scaler results: [[0.16060468 0.52941176 0.02742132 0.02532079 0.02561875 0.00184402
  0.4293305  0.47310757]]

The data was scaled in the exact same way across both scaler objects!
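If you'd like to check this yourself without a dataset at hand, the following is a self-contained sketch. It assumes synthetic data and a temporary directory (both chosen here for illustration only), and compares the learned parameters as well as the transformed output:

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic training data (assumed for illustration)
rng = np.random.default_rng(42)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 4))

scaler = StandardScaler().fit(X_train)

# Save to a temporary directory and load back into a separate object
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'scaler.save')
    joblib.dump(scaler, path)
    loaded = joblib.load(path)

# The fitted statistics should survive the round trip
params_match = bool(
    np.allclose(scaler.mean_, loaded.mean_)
    and np.allclose(scaler.scale_, loaded.scale_)
)
# Both objects should transform the data identically
outputs_match = np.array_equal(scaler.transform(X_train), loaded.transform(X_train))

print('Parameters match:', params_match)
print('Outputs match:', outputs_match)
```

Keep in mind that files saved with `joblib` are pickle-based, so it's safest to load them with the same Scikit-Learn version that was used to save them, and only from sources you trust.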

Last Updated: August 22nd, 2022
Author: David Landup


© 2013-2024 Stack Abuse. All rights reserved.
