Original article was published on Artificial Intelligence on Medium
Support Vector Machines (SVM) and its Python implementation
Support vector machines (SVMs) are supervised machine learning algorithms that can be used for both classification and regression. In this article, we discuss the key parameters of the SVM and try to understand the algorithm in detail.
To build intuition, let us consider an SVM used for classification. The following figure shows the geometric representation of SVM classification.
Looking at the above diagram, you might notice that the SVM classifies the data a bit differently from other algorithms. Let us understand this figure in detail. The red line is called the ‘hyperplane’. This is the line (or, in higher dimensions, the plane) that linearly separates the data. Along with the hyperplane, two planes parallel to it are created, and they are placed so that they pass through the points of each class that are closest to the hyperplane. The hyperplane lies exactly in the middle of these two parallel planes, and the distance between the two planes is called the ‘margin’.
Now a question arises: many different hyperplanes could separate the data, so why did we select the one in the above diagram? The answer is that we select the hyperplane for which the margin, i.e. the distance between the two parallel planes, is maximum. A wider margin leaves more room between the two classes, which generally helps the classifier handle unseen points better. The points that lie on the two parallel planes are called support vectors.
The above figure is obtained after training on the data. When classifying unknown or test data, the algorithm relies only on the support vectors; the remaining training points do not affect the decision.
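The geometry described above can be checked numerically. The following is a minimal sketch, assuming scikit-learn and NumPy are installed; the six 2D points are made up for illustration and are not from the article. It fits a linear SVM, reads off the support vectors, and computes the margin width as 2 / ||w||, where w is the normal vector of the hyperplane.

```python
import numpy as np
from sklearn import svm

# Toy, linearly separable data (invented for illustration): two clusters in 2D
X = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 1.0],
              [4.0, 4.0], [4.0, 5.0], [5.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel with a very large C approximates a hard-margin SVM
clf = svm.SVC(kernel="linear", C=1e6)
clf.fit(X, y)

# The hyperplane is w . x + b = 0; the two parallel planes are w . x + b = +1 and -1
w = clf.coef_[0]
b = clf.intercept_[0]

# Distance between the two parallel planes
margin = 2.0 / np.linalg.norm(w)

print("hyperplane normal w:", w, "intercept b:", b)
print("support vectors:\n", clf.support_vectors_)
print("margin width:", margin)
```

Only the points listed in `support_vectors_` determine the hyperplane; moving any of the other points (without crossing the margin) would leave the fitted model unchanged.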
The Python implementation is shown below. The data is first divided into a training dataset and a testing dataset, denoted X_train, X_test, y_train and y_test, with the help of scikit-learn's train_test_split function. Now let’s get to the implementation part. We start off by importing the library required to implement the SVM algorithm.
# Import SVM
from sklearn import svm

# Creating a SVM Classifier
classifier = svm.SVC(
    C=0.01, break_ties=False, cache_size=200, class_weight=None,
    coef0=0.0, decision_function_shape='ovr', degree=1, gamma='scale',
    kernel='rbf', max_iter=-1, probability=False, random_state=None,
    shrinking=True, tol=0.001, verbose=False,
)

# Training the model
classifier.fit(X_train, y_train)
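The snippet above assumes that X_train, X_test, y_train and y_test already exist. For readers who want something runnable end to end, here is a minimal sketch that uses the iris dataset bundled with scikit-learn as a stand-in; the dataset, the 25% test size and the parameter values are assumptions chosen for illustration, not the article's original setup.

```python
from sklearn import datasets, svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Iris is only a stand-in dataset for this sketch
X, y = datasets.load_iris(return_X_y=True)

# Hold out 25% of the rows for testing; stratify keeps class proportions equal
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Create the classifier and train it on the training split
classifier = svm.SVC(C=1.0, kernel="rbf", gamma="scale")
classifier.fit(X_train, y_train)

# Predict labels for the unseen test split and measure accuracy
y_pred = classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("test accuracy:", accuracy)
```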
As seen from the above code block, the implementation is quite easy. Let us discuss the parameters that are passed to SVC inside the brackets. These parameters are used to tune the model to obtain better accuracy. The most commonly tuned parameters are:
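C: This is the regularization parameter. The strength of the regularization is inversely proportional to C, so a small C tolerates more misclassified training points in exchange for a wider margin, while a large C penalizes them more heavily. The most common values that we try for C are 1, 10, 100 and 1000.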
kernel: The most commonly used kernels are ‘linear’, ‘poly’ and ‘rbf’. When the data is linearly separable, we can use the linear kernel. When the data is not linearly separable and the relationship is of a higher degree, we use ‘poly’ as the kernel. The ‘rbf’ kernel, which is also the default in SVC, is used when we don’t know in advance which kernel is appropriate.
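gamma: This is the kernel coefficient for the non-linear kernels (‘rbf’, ‘poly’ and ‘sigmoid’). When the points are not linearly separable, the kernel implicitly maps them into a higher-dimensional space where they can be separated, and gamma controls how far the influence of a single training point reaches. A small gamma gives a smoother decision boundary, which means higher bias and lower variance, while a large gamma gives a more complex boundary, which means lower bias and higher variance and a greater risk of overfitting. Thus we need to find the best combination of ‘C’ and ‘gamma’.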
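A common way to search for a good combination of C and gamma is a cross-validated grid search. The following is a minimal sketch, assuming scikit-learn's GridSearchCV and, once again, the iris dataset as a stand-in; the C values are the ones listed above, while the gamma values and the 5-fold setting are assumptions chosen for illustration.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, train_test_split

# Stand-in dataset for this sketch
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Every combination of these values is tried
param_grid = {
    "C": [1, 10, 100, 1000],
    "gamma": [0.001, 0.01, 0.1, 1],
    "kernel": ["rbf"],
}

# 5-fold cross-validation on the training split only
search = GridSearchCV(svm.SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)

# The best model is refit on the full training split and scored on held-out data
print("test accuracy:", search.score(X_test, y_test))
```

The test split is kept out of the search so that the final score is not biased by the tuning itself.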
Support vector machines are among the most widely used algorithms in machine learning. I hope that the geometric interpretation, the Python implementation and the parameter tuning are now clear. Happy learning!