In this experiment, RBF kernel SVMs are trained on the Pima
Indians Diabetes data set with different values of the kernel width parameter
(gamma). The value of this parameter is
increased from 0.001 to 5, while the regularization parameter (C)
is fixed to 1 to obtain comparable results. The data set is split into a
training set and a test set: 75% of the examples form the training
set, and the rest are used for testing. The classification error rates on both
the training and the test sets are determined for each SVM.
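The procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used to produce the reported results: it assumes scikit-learn, assumes that the gamma of the text corresponds to the `gamma` argument of `SVC`, uses a logarithmic grid of ten values (the text gives only the end points 0.001 and 5), and standardizes the features, which the text does not mention. The helper name `rbf_error_curve` is hypothetical, and synthetic data stands in for the Pima Indians Diabetes data so that the sketch runs without the data file.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def rbf_error_curve(X, y, gammas, C=1.0, test_size=0.25, seed=0):
    """Return (train_errors, test_errors), one entry per gamma value."""
    # 75% of the examples form the training set, the rest the test set.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    # Assumption: features are standardized using training-set statistics.
    scaler = StandardScaler().fit(X_tr)
    X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

    train_err, test_err = [], []
    for gamma in gammas:
        # C is held fixed so that only gamma changes between models.
        clf = SVC(kernel="rbf", gamma=gamma, C=C).fit(X_tr, y_tr)
        train_err.append(1.0 - clf.score(X_tr, y_tr))
        test_err.append(1.0 - clf.score(X_te, y_te))
    return np.array(train_err), np.array(test_err)


# Synthetic stand-in data (NOT the Pima Indians Diabetes data).
X_demo, y_demo = make_classification(n_samples=400, n_features=8, random_state=0)
gammas = np.geomspace(0.001, 5, 10)  # assumed grid between the stated end points
train_err, test_err = rbf_error_curve(X_demo, y_demo, gammas)

for g, tr, te in zip(gammas, train_err, test_err):
    print(f"gamma={g:.4f}  train error={tr:.3f}  test error={te:.3f}")
```

To repeat the experiment on the real data, `X_demo` and `y_demo` would be replaced by the feature matrix and class labels loaded from the data set. Plotting `train_err` and `test_err` against `gammas` gives curves of the kind shown in the figure.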
Pima Indians Diabetes [UCI MLR]
Figure 8.21. The classification error rates of the SVM on the training and the test sets, plotted against the value of the RBF kernel width parameter.
The value of the RBF kernel width parameter can be chosen such that the SVM classifies all training examples perfectly. Unfortunately, such a model does not perform well on the test data, which indicates overfitting. It should be noted that the linear SVM does not perform as well on the training set: its classification error rate is around 20%.
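The contrast between the linear SVM and an RBF SVM with a large gamma can be checked with a short comparison of training errors. The sketch below is a hedged illustration, again assuming scikit-learn and using noisy synthetic data rather than the Pima Indians Diabetes data, so the numbers it prints will not match the 20% quoted above. The variable names and the choice gamma = 5 (the upper end of the range in the text) are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic stand-in data with 20% label noise (NOT the Pima data).
X, y = make_classification(
    n_samples=300, n_features=8, flip_y=0.2, random_state=0
)

# Same regularization parameter for both models; only the kernel differs.
linear_svm = SVC(kernel="linear", C=1.0).fit(X, y)
rbf_svm = SVC(kernel="rbf", gamma=5.0, C=1.0).fit(X, y)

# Training error = fraction of misclassified training examples.
linear_train_err = 1.0 - linear_svm.score(X, y)
rbf_train_err = 1.0 - rbf_svm.score(X, y)

print(f"linear SVM training error:      {linear_train_err:.3f}")
print(f"RBF SVM (gamma=5) training error: {rbf_train_err:.3f}")
```

A low training error for the RBF model says nothing about its test error; evaluating both models on held-out data is what reveals the overfitting discussed above.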