Support Vector Machines

Support Vector Machines (SVMs) are supervised learning models that can be used for both classification and regression problems.

For classification, each sample is represented as a point in space, and the classes are separated by a hyperplane chosen so that it has the maximum distance from the nearest points of the two classes being separated.

The margin of separation is measured with respect to the points of each class that lie closest to the dividing hyperplane. The hyperplane is placed so as to maximize its distance from these points, which are called the support vectors because they define (support) the margin. Hence the term Support Vector in the name Support Vector Machines.
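As a minimal sketch of this idea (the toy dataset, sizes, and seed below are illustrative assumptions, not from the original post), scikit-learn exposes the fitted support vectors directly:

from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# A toy, linearly separable dataset
X, y = make_blobs(n_samples=60, centers=2, random_state=42)

linear_svc = SVC(kernel='linear')
linear_svc.fit(X, y)

print(linear_svc.support_vectors_)  # coordinates of the support vectors
print(linear_svc.n_support_)        # number of support vectors per class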

Often the data points are not linearly separable, in which case we have to use the kernel trick, e.g. with a polynomial kernel. This works by implicitly creating higher-dimensional features from the given features. Check this short video to see how it is done visually.
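Here is a small sketch of the kernel trick on data that is not linearly separable; the dataset and kernel settings below are illustrative assumptions:

from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric circles: no straight line in 2-D can separate them
X, y = make_circles(n_samples=200, noise=0.05, factor=0.5, random_state=42)

# A polynomial kernel implicitly maps the points into a higher-dimensional
# space where a separating hyperplane does exist (coef0=1 includes the
# lower-order terms needed for this shape)
poly_svc = SVC(kernel='poly', degree=2, coef0=1.0)
poly_svc.fit(X, y)
print(poly_svc.score(X, y))  # training accuracy, close to 1.0 here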

To determine the class of a new data point, we check which side of the hyperplane (or hyperplanes, in the multi-class case) the point lies on, and classify it accordingly.
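A minimal sketch of this for a binary toy problem (all names and values below are illustrative assumptions): the sign of the decision function tells us the side of the hyperplane, and hence the class:

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=0)
clf = SVC(kernel='linear').fit(X, y)

# decision_function returns the signed distance from the hyperplane;
# the sign of that distance determines the predicted class
new_point = np.array([[0.0, 0.0]])  # a hypothetical unseen sample
print(clf.decision_function(new_point))
print(clf.predict(new_point))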

Refer to this video if you want to look into more mathematical details.

Key factors to consider about SVM:

  • It is most effective when the classes are clearly separated, or at least have little overlap
  • It remains effective when the number of features exceeds the number of samples
  • It uses only the support vector points for prediction (unlike KNN, where distances are computed against all training points), so it is more memory efficient; see the sketch after this list
  • Training can be computationally very expensive on large datasets
  • SVM often works better than logistic regression on multi-class classification problems
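
To make the memory-efficiency point concrete, here is a minimal sketch; the dataset, sizes, and seed are illustrative assumptions:

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# With well-separated classes, few points end up as support vectors,
# so the fitted model stores far less than the full training data
X, y = make_classification(n_samples=1000, n_features=20, class_sep=2.0,
                           random_state=0)
clf = SVC(kernel='rbf').fit(X, y)

print(len(X))                # 1000 training points
print(clf.n_support_.sum())  # typically only a small fraction of them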

Now let's look at SVM in practice on the Iris dataset.


Iris Flower Data

Reference: Iris flower data set.

Get the data

import seaborn as sns

iris = sns.load_dataset('iris')
iris.head()
   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

Train Test Split

Split your data into a training set and a testing set.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(iris.drop('species', axis=1),
                                                    iris['species'],
                                                    test_size=0.3,
                                                    random_state=101)

Train a Model

Now we can train a Support Vector Machine Classifier.

Call the SVC() model from sklearn and fit the model to the training data.

from sklearn.svm import SVC
model = SVC()
model.fit(X_train,y_train)

 Out:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)
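
The printed output above shows the defaults the model was trained with, e.g. kernel='rbf' and C=1.0. If the data called for different settings, they could be passed at construction time; a hypothetical variant (not used in the evaluation below):

# Hypothetical alternative settings -- the evaluation that follows
# still uses the default model trained above
linear_model = SVC(kernel='linear', C=0.5)
linear_model.fit(X_train, y_train)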

Model Evaluation

predictions = model.predict(X_test)
from sklearn.metrics import classification_report,confusion_matrix
print(confusion_matrix(y_test,predictions))
Out: 
[[13  0  0]
 [ 0 20  0]
 [ 0  0 12]]

Print classification report:

print(classification_report(y_test,predictions))
 Out: 
             precision    recall  f1-score   support

     setosa       1.00      1.00      1.00        13
 versicolor       1.00      1.00      1.00        20
  virginica       1.00      1.00      1.00        12

avg / total       1.00      1.00      1.00        45

Inference

Interesting! The model is 100% accurate on this test set, having predicted every sample correctly. That leaves no room for further optimization, e.g. via grid search.

Refer to the Grid Search blog, where it is used to find the optimal hyper-parameters for an SVM classifier.
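For reference, a minimal grid-search sketch over C and gamma (the grid values below are illustrative assumptions, and X_train/X_test come from the split above):

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# An illustrative parameter grid, not tuned for this dataset
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [1, 0.1, 0.01, 0.001]}

# refit=True retrains the best model found on the full training set
grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=2)
grid.fit(X_train, y_train)

print(grid.best_params_)  # the best C/gamma combination found
grid_predictions = grid.predict(X_test)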