# Introduction

People have been forging bank notes for a long time. Most forgeries are fairly easy to recognise by eye, because official bank notes are produced with proprietary technology that is hard to copy.

If a human can tell real notes from fake ones, is there a way to do the same automatically?

There are many ways to answer this question and automate the process. One approach would be to photograph each received note and train an image classifier to "fish out" the fake ones. We don't need to work with images directly, though: we can use a set of measurements derived from each banknote, together with an algorithm that is far cheaper computationally than a deep neural network!

Instead of working with raw images, we can downscale them, convert them to grayscale and extract or quantize a handful of measurements from each one. This way, we compare image measurements rather than individual pixels. The measurements are derived from the images, but they are a much more compact way to work with the data!
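To make the idea of "measurements instead of pixels" concrete, here is a minimal sketch that reduces a grayscale image to four summary statistics. The choice of statistics (variance, skewness, kurtosis and entropy), the helper name `image_measurements` and the randomly generated stand-in image are all assumptions made for illustration, not necessarily how the notes in this project were processed.

```python
import numpy as np

def image_measurements(gray):
    """Reduce a 2-D grayscale image (values 0-255) to four summary statistics."""
    pixels = np.asarray(gray, dtype=float).ravel()
    centered = pixels - pixels.mean()
    std = pixels.std()

    variance = std ** 2
    skewness = (centered ** 3).mean() / std ** 3
    kurtosis = (centered ** 4).mean() / std ** 4 - 3.0  # excess kurtosis

    # Shannon entropy of the intensity histogram, in bits
    counts, _ = np.histogram(pixels, bins=256, range=(0, 256))
    probs = counts[counts > 0] / pixels.size
    entropy = -(probs * np.log2(probs)).sum()

    return np.array([variance, skewness, kurtosis, entropy])

# Hypothetical stand-in for a scanned note: 400x400 random grayscale pixels
rng = np.random.default_rng(42)
fake_scan = rng.integers(0, 256, size=(400, 400))

features = image_measurements(fake_scan)
print(features)
```

A 400x400 image holds 160,000 pixel values, while the derived representation holds only four numbers, which is what makes a lightweight classifier practical.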

So far, we've found a way to process and compare bank notes, but how will they be classified as real or forged?

Really, any classification model would work, but this is a great opportunity in the course to dive into Support Vector Machines, commonly abbreviated as SVMs.

### Background on SVMs

SVMs were first introduced in the 1960s by Vladimir Vapnik and Alexey Chervonenkis. At that time, their algorithm was limited to classifying data that could be separated using just one straight line, that is, data that is **linearly separable**. We can see what that separation would look like:

In the above image, we have a line in the middle, with some points to its left and the others to its right.

Notice that both groups of points are perfectly separated: there are no points in between, or even close to, the line. The gap between the line and the nearest points on either side is called the **separation margin**. The wider that margin, the more room there is between each group and the line that divides them. To place the line, SVMs rely on only a few points, the ones closest to the boundary, whose perpendicular distance to the line defines the margin. Those points are the **support vectors** (hence the name). The straight line in the middle is found by methods that **maximize** the separation margin, that is, the space between the line and the nearest points. Those methods originate from the field of *Optimization Theory*.
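As a rough illustration of the margin and the support vectors, the sketch below fits a linear SVM to six hand-picked points using Scikit-Learn's `SVC`. The toy coordinates and the very large `C` value (used here to approximate a margin with no violations) are assumptions chosen for the example. For a linear SVM with weight vector `w`, the width of the margin is `2 / ||w||`.

```python
import numpy as np
from sklearn.svm import SVC

# Two small, well-separated groups of points (made-up toy data)
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
              [6.0, 5.0], [6.5, 6.0], [7.0, 5.5]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no tolerance for margin violations
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]                          # normal vector of the separating line
b = clf.intercept_[0]                     # offset of the separating line
margin_width = 2.0 / np.linalg.norm(w)    # distance between the two margin edges

print("Support vectors:\n", clf.support_vectors_)
print("Margin width:", margin_width)
```

Only the points closest to the boundary show up in `clf.support_vectors_`; removing any of the other points would leave the fitted line unchanged.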

In the example we've just seen, the two groups can be easily separated, since the points within each group are close together and the two groups are far from each other.

But what happens if there is no way to separate the data using one straight line? What if there are messy, out-of-place points, or if a curve is needed?

To solve that problem, SVMs were refined in the 1990s so that they could also classify data containing points far from the rest of their group, such as outliers, as well as more complex problems that had more than two dimensions and weren't linearly separable.
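In Scikit-Learn, the tolerance for out-of-place points is controlled by the `C` parameter of `SVC`: a small `C` accepts more margin violations, while a large `C` penalizes them heavily. The sketch below compares the two on overlapping synthetic clusters. The cluster centers, spread and `C` values are illustrative assumptions.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping clusters: no straight line separates them perfectly
X, y = make_blobs(n_samples=200, centers=[[0, 0], [3, 3]],
                  cluster_std=1.5, random_state=0)

# Small C tolerates many margin violations, large C tolerates few
tolerant = SVC(kernel="linear", C=0.01).fit(X, y)
strict = SVC(kernel="linear", C=100).fit(X, y)

for name, model in [("C=0.01", tolerant), ("C=100", strict)]:
    print(name,
          "| support vectors:", model.n_support_.sum(),
          "| training accuracy:", model.score(X, y))
```

The more tolerant model typically ends up with a wider margin and therefore more support vectors, since every point inside the margin counts as one.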

What is curious is that only in recent years have SVMs become widely adopted, mainly because they can reach more than 90% **accuracy** (the share of correct answers) on some difficult problems.

**Note:** Since then, research and production applications have come to rely more commonly on Random Forests and/or Gradient Boosting methods, due to their strong predictive power compared to other common statistical models, their optimization with respect to the underlying hardware, etc. Even so, SVMs are still used for their speed of training, as well as the great performance and interpretability of linear SVMs on linearly separable datasets.

SVMs are implemented in a unique way compared to other machine learning algorithms, since they are based on statistical explanations of what learning is, that is, on *Statistical Learning Theory*.

In this project, we'll cover what Support Vector Machines are, the brief theory behind them, and their implementation in Python's Scikit-Learn library. We will then move on to another SVM concept, known as **Kernel SVM**, which takes advantage of the **kernel trick** that allows SVMs to be applied to non-linear contexts.
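As a small preview of why the kernel matters, the sketch below compares a linear SVM with an RBF-kernel SVM on data shaped like two concentric circles, which no straight line can separate. The synthetic dataset from `make_circles`, the 70/30 split and the default hyperparameters are assumptions chosen for illustration, not the setup used later with the banknote data.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# One class forms an inner circle, the other an outer ring
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Same algorithm, two kernels: a straight line vs. a non-linear boundary
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)
rbf_acc = SVC(kernel="rbf").fit(X_train, y_train).score(X_test, y_test)

print("Linear kernel accuracy:", linear_acc)
print("RBF kernel accuracy:", rbf_acc)
```

The linear model is stuck with a straight boundary, while the RBF kernel lets the same algorithm draw a closed curve around the inner circle.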