Original article was published on Artificial Intelligence on Medium
On your way to learning machine learning, you will come across some fundamental algorithms such as linear regression, decision trees, and support vector machines. Understanding them gives you a solid grasp of the basics of the field. You can find a full introduction to linear regression right here:
What is a Support Vector Machine?
A support vector machine (SVM) is a supervised machine learning algorithm that can be used for classification or regression problems. It can use a technique called the kernel trick to implicitly transform your data, and then, based on this transformation, it finds an optimal boundary between the possible outputs. Simply put, it does some fairly complex data transformations, then figures out how to separate your data based on the labels or outputs you’ve defined.
How does SVM work?
The main objective is to separate the classes in the given dataset in the best possible way. The data points of each class that lie closest to the separating boundary are called support vectors, and the gap between the boundary and those nearest points is known as the margin. The objective is to select the hyperplane with the maximum possible margin. SVM searches for this maximum-margin hyperplane in the following steps:
- Generate hyperplanes that separate the classes. The left-hand figure shows three hyperplanes in black, blue, and orange. The blue and orange ones have higher classification error, while the black one separates the two classes correctly.
- Select the hyperplane with the maximum distance from the nearest data points of either class.
A hyperplane is a decision boundary that separates sets of objects with different class memberships.
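To make the idea of a hyperplane as a decision boundary concrete, here is a minimal sketch. It assumes a hyperplane written as w·x + b = 0, and the values of `w`, `b`, and the sample points are made up purely for illustration. A point is assigned to one class or the other depending on which side of the hyperplane it falls.

```python
import numpy as np

# Hypothetical hyperplane parameters, chosen only for illustration.
w = np.array([1.0, -1.0])  # normal vector of the hyperplane
b = 0.0                    # offset of the hyperplane from the origin


def classify(points, w, b):
    """Return +1 or -1 depending on which side of w.x + b = 0 each point lies."""
    scores = points @ w + b
    return np.where(scores >= 0, 1, -1)


# Three made-up 2D points.
points = np.array([[2.0, 0.5], [0.5, 2.0], [3.0, 1.0]])
labels = classify(points, w, b)
print(labels)
```

Training an SVM amounts to choosing `w` and `b` from the data instead of fixing them by hand as done here.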
The dimension of the hyperplane depends on the number of features: with n features, the separating hyperplane has n - 1 dimensions. For a simple classification model with two features, the decision boundary is a line, as in the left side image. With three features the boundary is a plane, and in general it is called a hyperplane, because the data points may live in a space with more than three dimensions. As the number of features increases, so does the number of dimensions the ML model works in, which is hard to picture for, let’s say, 10, 20, or 100 dimensions. There are also techniques to reduce dimensionality, such as PCA, backward/forward feature elimination, and high-correlation / low-variance filters, since too many features make it computationally expensive to fit a model and make classifications.
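As a rough illustration of dimensionality reduction, the sketch below applies PCA from scikit-learn to synthetic data. The data, the 20 input features, and the choice of 5 components are all assumptions made for the example, not values from this article.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: 200 samples with 20 features (made up for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))

# Keep the 5 directions of highest variance; 5 is an arbitrary choice.
pca = PCA(n_components=5)
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)
print("fraction of variance kept:", pca.explained_variance_ratio_.sum())
```

On real data you would usually pick the number of components by looking at how much variance is kept, rather than fixing it in advance.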
A margin is the gap between the two lines drawn through the closest points of each class. It is calculated as the perpendicular distance from the separating line to the support vectors, that is, the closest points. A larger margin between the classes is considered a good margin; a smaller margin is a bad one.
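The margin can be computed directly from a fitted linear SVM. The sketch below assumes scikit-learn's `SVC` with a linear kernel and a tiny made-up dataset. For a boundary w·x + b = 0 scaled so that the support vectors satisfy |w·x + b| = 1, the distance from the boundary to each support vector is 1/||w|| and the full gap between the classes is 2/||w||. The very large `C` is an assumption used to approximate a hard margin, which only makes sense because this toy data is perfectly separable.

```python
import numpy as np
from sklearn.svm import SVC

# Tiny made-up dataset: two points per class, separable by the line x1 = 1.
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([-1, -1, 1, 1])

# A very large C approximates a hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]       # normal vector of the separating hyperplane
b = clf.intercept_[0]  # offset of the hyperplane

# Full gap between the two classes, measured perpendicular to the boundary.
margin_width = 2.0 / np.linalg.norm(w)

# Perpendicular distance of every training point to the boundary.
distances = np.abs(X @ w + b) / np.linalg.norm(w)

print("margin width:", margin_width)
print("support vectors:\n", clf.support_vectors_)
print("distances to boundary:", distances)
```

The classes here sit at x1 = 0 and x1 = 2, so the widest possible gap is about 2, with the boundary halfway between them.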
Let’s say you have a bunch of red and blue points on cardboard. You are asked to draw a straight line to separate them. You look at the points and realize that there is no way this can happen because, in order to separate all the points correctly, you will have to draw a squiggly line.
Now you take a step back and see that you were looking at one face of a 3-dimensional cube. In this cube, you see that you can easily place a simple planar cardboard piece somewhere to separate the red and blue points (because they are at different depths).
Coming to the concept of SVM, it relies on the fact that finding the properties of that cardboard piece in the 3-dimensional cube is much easier than finding the properties of that squiggly line on the 2-dimensional cardboard. So it just extracts the properties of that cardboard piece and projects it back to 2 dimensions. A simple plane in 3 dimensions will look like a squiggly line in 2 dimensions.
To generalize, SVM implicitly projects the given data points into a much higher-dimensional space (using kernel functions), finds the separating hyperplane there, and maps it back to the lower dimensions. SVM defines that boundary using something called support vectors. They are the points that are closest to the boundary and “support” the separation. The separating boundary will be the optimal boundary (equidistant from both sets of points).
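The cardboard analogy above can be sketched in a few lines. This is an illustrative example on synthetic data, not the kernel trick itself: it adds the extra dimension explicitly, whereas a kernel function gets the same effect implicitly without ever building the new coordinates. The ring-shaped data, the lifting function z = x1² + x2², and the plane z = 2.25 are all assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Made-up 2D data: an inner blob (class -1) surrounded by an outer ring (class +1).
# No straight line in 2D can separate them.
angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n)
radii = np.concatenate([rng.uniform(0.0, 1.0, n), rng.uniform(2.0, 3.0, n)])
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
y = np.concatenate([-np.ones(n), np.ones(n)])

# Lift to 3D by adding the squared distance from the origin as a third feature.
z = (X ** 2).sum(axis=1)
X_lifted = np.column_stack([X, z])

# In 3D, the flat plane z = 2.25 separates the classes (hand-picked, not learned).
w = np.array([0.0, 0.0, 1.0])
b = -2.25
pred = np.sign(X_lifted @ w + b)

accuracy = (pred == y).mean()
print("accuracy with a flat plane in 3D:", accuracy)
```

Projected back to 2 dimensions, the plane z = 2.25 becomes the circle x1² + x2² = 2.25, which is the curved boundary that could not be drawn as a straight line.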
Implementation with Python
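Below is a minimal sketch of what an SVM implementation in Python could look like, assuming scikit-learn. The breast cancer dataset bundled with scikit-learn, the RBF kernel, the 70/30 split, and the parameter values are illustrative choices, not taken from this article.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load a small binary classification dataset that ships with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 30% of the data for testing; the seed is fixed for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# SVMs are sensitive to feature scale, so standardize using training statistics only.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fit an SVM classifier with an RBF kernel (C and gamma are scikit-learn defaults).
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train_scaled, y_train)

# Predict on the held-out data and measure accuracy.
y_pred = clf.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)

print(f"Test accuracy: {accuracy:.3f}")
print("Support vectors per class:", clf.n_support_)
```

Swapping `kernel="rbf"` for `kernel="linear"` gives the straight-boundary version of the model discussed earlier.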
Evaluation of the model using other metrics. If you are interested in learning about these metrics, visit:
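As one possible way to carry out such an evaluation, the sketch below computes a confusion matrix, precision, recall, and F1 score for an SVM classifier, assuming scikit-learn. The dataset, the pipeline, and the split are illustrative assumptions, and the choice of metrics is only an example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative data and split; any labeled binary dataset would do.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Scaling and the classifier are chained so the scaler is fit on training data only.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows of the confusion matrix are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print("Confusion matrix:\n", cm)
print(f"Precision: {precision:.3f}  Recall: {recall:.3f}  F1: {f1:.3f}")
```

Precision and recall are worth checking alongside accuracy when the classes are imbalanced or when one kind of mistake costs more than the other.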
I hope I was able to clarify it a little for you. SVM is one of the basic algorithms, and I will be uploading a lot more explanations of algorithms, because why not 🙂
This is my personal research; if you have any comments, please reach out to me.
Welcome to my Medium page