People have been forging bank notes for a long time - and most forged notes are fairly easy to recognise by eye, because official bank notes are produced with proprietary technology that isn't easy to copy.
<blockquote>
If a human can discern between real and fake notes - is there a way to automatically know if bank notes are forged or real?
</blockquote>
There are many ways to answer this question and go about automating the process. One approach would be to photograph each received note, and train an image classifier to &quot;fish out&quot; the real and fake notes. That said, we don't need to work with images directly - we can use a set of data derived from each banknote, with an algorithm that is computationally far cheaper than deep neural networks!
Instead of using raw images, we can compress them, reduce them to grayscale and have their measurements extracted or quantized. This way, the comparison is between image measurements, instead of between each image's pixels. The measurements are derived from the images, but this is a much more compact way to work with the data!
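As a rough illustration of turning an image into a handful of measurements, the sketch below reduces a grayscale array to four summary statistics. The choice of statistics (variance, skewness, kurtosis, entropy), the `summarize_image` helper and the random stand-in image are all assumptions made for this example; they are not taken from the dataset's actual extraction process.

```python
import numpy as np

def summarize_image(gray):
    """Reduce a 2D grayscale array to four summary measurements (hypothetical helper)."""
    pixels = np.asarray(gray, dtype=float).ravel()
    mean = pixels.mean()
    std = pixels.std()
    centered = pixels - mean
    variance = std ** 2
    # Standardized third and fourth moments (set to 0 for a perfectly flat image)
    skewness = (centered ** 3).mean() / std ** 3 if std > 0 else 0.0
    kurtosis = (centered ** 4).mean() / std ** 4 - 3 if std > 0 else 0.0
    # Shannon entropy of the intensity histogram, in bits
    counts, _ = np.histogram(pixels, bins=256, range=(0, 256))
    probs = counts[counts > 0] / counts.sum()
    entropy = -(probs * np.log2(probs)).sum()
    return np.array([variance, skewness, kurtosis, entropy])

# Random pixel values standing in for a scanned, grayscale bank note
rng = np.random.default_rng(42)
fake_image = rng.integers(0, 256, size=(400, 400))

features = summarize_image(fake_image)
print(features.shape)  # (4,)
```

A 400x400 image holds 160,000 pixel values, while its summary here holds only four numbers, which is what makes a lighter model practical.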
<blockquote>
So far, we've found a way to process and compare bank notes, but how will they be classified into real or forged?
</blockquote>
Really any classification model would work, but this is a great opportunity in the course to dive into Support Vector Machines, commonly known by their abbreviated name: SVMs.
<h3 id="backgroundonsvms">Background on SVMs</h3>
SVMs were first introduced in the 1960s, by Vladimir Vapnik and Alexey Chervonenkis. At that time, their algorithm was limited to the classification of data that could be separated using just one straight line, that is, data that is linearly separable. We can see what that separation would look like:
<img src="svm-in-python-with-sklearn-0.png" alt="">
In the above image we have a line in the middle, relative to which some points are to the left, and others are to the right.
Notice that both groups of points are perfectly separated: there are no points in between or even close to the line. There is a gap between each group of similar points and the line that divides them, called the separation margin. The wider that margin, the more clearly the two groups are kept apart. SVMs find it by using the points of each group that lie closest to the line, treated as vectors, to support the decision about where the line and its margin go. Those points are the support vectors (hence the name). The straight line in the middle is found by methods that maximize the space between the line and the points, that is, that maximize the separation margin. Those methods originate from the field of Optimization Theory.
In the example we've just seen, both groups of points can be easily separated, since each point is close to the other points of its own group, and the two groups are far from each other.
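The ideas of support vectors and margin can be inspected directly on a fitted model. The following is a minimal sketch on synthetic, well separated points generated with `make_blobs`. The cluster centers and the large `C` value (used here to approximate a hard margin) are assumptions for illustration, not settings from this project.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two compact groups of points that are far apart (synthetic data)
X, y = make_blobs(
    n_samples=60,
    centers=[(-3, -3), (3, 3)],
    cluster_std=0.8,
    random_state=0,
)

# A linear SVM; a large C leaves very little tolerance for margin violations
clf = SVC(kernel="linear", C=1000)
clf.fit(X, y)

# The points that define the margin
print(clf.support_vectors_)

# For a linear SVM, the margin width is 2 / ||w||
w = clf.coef_[0]
margin_width = 2 / np.linalg.norm(w)
print(margin_width)
```

Only a few of the 60 points end up as support vectors. The remaining points could be moved around, as long as they stay outside the margin, without changing the dividing line.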
<blockquote>
But what happens if there is no way to separate the data using one straight line? If there are messy, out-of-place points, or if a curve is needed?
</blockquote>
To solve that problem, SVMs were refined in the 1990s so they could also classify data containing points far from the central tendency, such as outliers, as well as more complex problems that had more than two dimensions and weren't linearly separable.
Curiously, SVMs only became widely adopted in recent years, mainly due to their ability to sometimes achieve more than 90% accuracy (correct answers) on difficult problems.

 <div class="alert alert-note">
 <div class="flex">
 
 <div class="flex-shrink-0 mr-3">
 <img src="/assets/images/icon-information-circle-solid.svg" class="icon" aria-hidden="true" />
 </div>
 
 <div class="w-full">
 Note: Since then, research and production applications have more commonly relied on Random Forests and/or Gradient Boosting methods, due to their strong predictive power compared to other common statistical models, optimization with respect to the underlying hardware, etc. Even so - SVMs are still used for their training speed, as well as the great performance and interpretability of linear SVMs on linear datasets.

 </div>
 </div>
 </div>
SVMs are implemented in a unique way compared to other machine learning algorithms, since they are based on statistical explanations of what learning is, that is, on Statistical Learning Theory.
In this project, we'll cover what Support Vector Machine algorithms are, the brief theory behind them, and their implementation in Python's Scikit-Learn library. We will then move on to another SVM concept, known as Kernel SVM, which takes advantage of the Kernel trick that allows SVMs to be applied to non-linear contexts.
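To get a first feel for the difference between a linear SVM and a Kernel SVM, the sketch below fits both on data that no straight line can separate: two concentric circles from `make_circles`. The dataset, the RBF kernel and the default hyperparameters are assumptions chosen for illustration.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# One class forms an inner circle, the other an outer ring (synthetic data)
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same data, two kernels
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)
rbf_acc = SVC(kernel="rbf").fit(X_train, y_train).score(X_test, y_test)

print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"RBF kernel accuracy:    {rbf_acc:.2f}")
```

The linear kernel can only draw a straight line through the circles, so it misclassifies a large share of the points, while the RBF kernel can wrap a curved boundary around the inner circle.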

David Landup

Cássia Sampaio

Introduction

Following the example given in the introduction, we will use a dataset that has measurements of real and forged bank note images.
When looking at two notes, our eyes usually scan them from left to right and check where there might be similarities or dissimilarities. We look for a black dot coming before a green dot, or a shiny mark that is above an illustration. This means that there is an order in which we look at the notes. If we knew there were green and black dots, but not whether the green dot came before the black one or the other way around, it would be harder to discriminate between notes.
A method similar to what we have just described can be applied to the bank note images. In general terms, this method consists of translating the image's pixels into a signal, then taking into account the order in which each different signal occurs in the image by transforming it into little waves, or wavelets. The wavelets tell us the order in which one signal occurs before another, or the time of occurrence, but not exactly which signal it is. To know that, the image's frequencies need to be obtained. They are obtained by a method that decomposes each signal, called the Fourier method.
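To make the idea of wavelets locating where something happens more concrete, here is a minimal sketch of a single level of the Haar wavelet transform applied to one row of pixel intensities. The Haar wavelet is chosen only because it is the simplest one to write by hand; it is an assumption, as is the `haar_step` helper and the made-up row of pixels.

```python
import numpy as np

def haar_step(signal):
    """One level of the Haar wavelet transform of an even-length 1D signal."""
    signal = np.asarray(signal, dtype=float)
    pairs = signal.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)  # local averages
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)  # local changes
    return approx, detail

# A made-up row of pixels: dark on the left, bright on the right
row = np.array([10, 10, 10, 10, 10, 200, 200, 200])
approx, detail = haar_step(row)

print(approx)
print(detail)  # only the third pair, where the jump happens, is non-zero
```

The detail coefficients are zero wherever neighbouring pixels are equal and non-zero where the intensity jumps, so they keep track of where in the row a change occurs.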

Exploratory Data Analysis

Before getting further into the theory of how SVMs work, we can build our first baseline model with the data and Scikit-Learn's Support Vector Classifier (SVC) class.
Our model will receive the wavelet coefficients and try to classify them based on the class. The first step in this process is to separate the coefficients, or features, from the class, or target. The second step is to further divide the data into a set that will be used for the model's learning, the train set, and another that will be used for the model's evaluation, the test set.
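The two steps above can be sketched as follows. Since this snippet is meant to run on its own, it uses a synthetic DataFrame in place of the bank note data: the column names (`feature_1` to `feature_4`, `class`), the 80/20 split and the linear kernel are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the dataset: two classes with shifted feature means
rng = np.random.default_rng(0)
n = 200
real = rng.normal(loc=0.0, scale=1.0, size=(n, 4))
forged = rng.normal(loc=2.0, scale=1.0, size=(n, 4))
columns = ["feature_1", "feature_2", "feature_3", "feature_4"]  # hypothetical names
df = pd.DataFrame(np.vstack([real, forged]), columns=columns)
df["class"] = [0] * n + [1] * n

# Step 1: separate the features from the target
X = df.drop(columns="class")
y = df["class"]

# Step 2: divide the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Baseline model: fit on the train set, evaluate on the test set
svc = SVC(kernel="linear")
svc.fit(X_train, y_train)
accuracy = svc.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Passing `stratify=y` keeps the proportion of each class the same in both sets, and fixing `random_state` makes the split reproducible.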

 <div class="alert alert-note">
 <div class="flex">
 
 <div class="flex-shrink-0 mr-3">
 <img src="/assets/images/icon-information-circle-solid.svg" class="icon" aria-hidden="true" />
 </div>
 
 <div class="w-full">
 Note: The nomenclature of test and evaluation/validation sets can be a little confusing, because you can also split your data into train, evaluation/validation and test sets. That way, instead of having two sets, you have an intermediary set that is used only to check whether your model's performance is improving. The model is trained on the train set, tuned with the evaluation/validation set, and given a final metric on the test set.
Some people call that intermediary set the evaluation set; others call it the test set and treat the evaluation set as the final one. Either way, this split is another way to try to guarantee that the model isn't evaluated on examples it has already seen, that no data leakage is happening, and that the model generalizes, as shown by an improvement in the metrics on the last set. If you want to follow that approach, you can further divide the data once more as described in this <a href="https://stackabuse.com/scikit-learns-traintestsplit-training-testing-and-validation-sets/">Scikit-Learn's train_test_split() - Training, Testing and Validation Sets</a> guide.</div></div></div>
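One common way to obtain the three sets described in the note is to call `train_test_split()` twice. The sketch below is a minimal example on placeholder arrays; the 60/20/20 proportions are an assumption, not a recommendation from this project.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 samples, 2 features, balanced binary target
X = np.arange(200).reshape(100, 2)
y = np.array([0, 1] * 50)

# First split: hold out 40% of the data
X_train, X_temp, y_train, y_temp = train_test_split(
    X, y, test_size=0.4, random_state=42, stratify=y
)

# Second split: divide the held-out part evenly into validation and test sets
X_val, X_test, y_val, y_test = train_test_split(
    X_temp, y_temp, test_size=0.5, random_state=42, stratify=y_temp
)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```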

Implementing SVM and Kernel SVM with Scikit-Learn and Python

Can you tell the difference between a real and a forged bank note? Probably!
Can you do it for 1000 bank notes? Probably! But it takes time.
This doesn't have to be a computer vision problem - in this Guided Project, you'll learn the intuition and theory behind Support Vector Machines (SVMs) and use them on a tabular dataset to determine whether a bank note is fraudulent or not. We'll be using Python, Scikit-Learn, Pandas and Seaborn, building from exploratory data analysis to training and evaluating a model.