Implementing SVM and Kernel SVM with Scikit-Learn and Python

David Landup
Cássia Sampaio

Before going deeper into the theory of how SVM works, we can build a first baseline model with the data and Scikit-Learn's Support Vector Classifier (SVC) class.

Our model will receive the wavelet coefficients and try to classify them by class. The first step is to separate the coefficients (the features) from the class (the target). The second step is to divide the data further into a train set, which the model learns from, and a test set, which is used to evaluate the model.
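The two steps above, followed by fitting a baseline classifier, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the lesson's exact code: the data is a synthetic stand-in generated with make_classification(), the column names f1 to f4 and class are hypothetical, and the 80/20 split, the random_state value and the linear kernel are arbitrary choices.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Assumption: synthetic stand-in for the real dataset, with four numeric
# features and a binary class. All column names here are hypothetical.
features, labels = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=1, random_state=42
)
data = pd.DataFrame(features, columns=["f1", "f2", "f3", "f4"])
data["class"] = labels

# Step 1: separate the features (X) from the target (y)
X = data.drop(columns="class")
y = data["class"]

# Step 2: hold out part of the data as a test set (20% here, an arbitrary choice)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Baseline model: a support vector classifier (the linear kernel is an assumption)
svc = SVC(kernel="linear")
svc.fit(X_train, y_train)

# Predict on the held-out test set, which the model did not see during fitting
y_pred = svc.predict(X_test)
```

Fixing random_state only makes the split reproducible between runs; any integer works.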

Note: The nomenclature of test and evaluation/validation sets can be a little confusing, because you can also split your data into train, evaluation/validation and test sets. Instead of two sets, you would then have an intermediary set used only to check whether your model's performance is improving. The model would be trained on the train set, tuned against the evaluation/validation set, and given a final metric on the test set.

Some people call the intermediary set the evaluation set, while others call the intermediary set the test set and the final set the evaluation set. Whatever the naming, the extra split is another way to try to guarantee that the model isn't evaluated on examples it has already seen and that no data leakage is happening, so that an improvement in the final set's metrics reflects real generalization. If you want to follow that approach, you can divide the data once more as described in the guide Scikit-Learn's train_test_split() - Training, Testing and Validation Sets.
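One common way to get the three sets described in this note is to call train_test_split() twice. The sketch below assumes an 80/10/10 split on toy data; the proportions, the use of stratify and the variable names are illustrative assumptions, not taken from the referenced guide.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Assumption: toy data with 100 unique rows and two balanced classes
X = np.arange(200).reshape(100, 2)
y = np.array([0, 1] * 50)

# First split: keep 80% for training, set aside 20% for the other two sets
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.8, random_state=42, stratify=y
)

# Second split: divide the remaining 20% evenly into validation and test sets
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=42, stratify=y_rest
)
```

Passing stratify keeps the class proportions similar across the three sets, which can matter when the classes are imbalanced.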


© 2013-2024 Stack Abuse. All rights reserved.
