Ensemble methods: boosting

Description

The experiment demonstrates the ensemble method of boosting. In this method, a better-fitting model is built by combining supervised data mining models. The method is based on repeatedly reweighting the records so that the wrongly classified cases gain more and more importance, and the subsequent classifiers concentrate on assigning them to the correct class. In the boosting method a base classifier is selected, which can be a decision tree, a logistic regression, a neural network, etc., and as many copies of it are built as the boosting cycle prescribes. In this experiment the base classifier is a decision tree and the boosting cycle is set to 20, i.e. 20 decision trees are fitted on the training dataset. The result is compared to that of a polynomial kernel support vector machine (SVM), which is recognized as an effective method for binary classification tasks. In the boosting method the base classifier is determined by the operator placed between the Start Groups and End Groups operators, and the size of the boosting cycle is set in the Start Groups operator.
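
For illustration only, the same scheme can be sketched with scikit-learn's AdaBoost implementation in place of the graphical operators of the workflow; the tree depth and the random seed below are assumptions, not settings taken from the experiment.

    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier

    # 20 copies of the base decision tree are built sequentially; at each
    # step the records misclassified so far receive a larger weight.
    boosted_model = AdaBoostClassifier(
        estimator=DecisionTreeClassifier(max_depth=2),  # weak base learner
        n_estimators=20,                                # the boosting cycle
        random_state=42,
    )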

Input

Spambase [UCI MLR]

In the preprocessing step the dataset is partitioned by the Data Partition operator in a 60/20/20 ratio into training, validation, and test sets.
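
A hedged sketch of the same 60/20/20 partition in scikit-learn follows; the file name spambase.data and the position of the class label are assumptions about a local copy of the UCI dataset.

    import pandas as pd
    from sklearn.model_selection import train_test_split

    data = pd.read_csv("spambase.data", header=None)  # hypothetical local copy
    X, y = data.iloc[:, :-1], data.iloc[:, -1]        # last column: spam label

    # 60% for training, then the remaining 40% halved into validation and test.
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, train_size=0.6, stratify=y, random_state=42)
    X_valid, X_test, y_valid, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=42)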

Output

The boosting classifier can be evaluated with the same tools that are available for the other supervised data mining models: statistics (number of incorrectly classified cases, misclassification rate) and graphs (response and lift curves). The only additional graph is the second figure, which shows the error of each of the resulting classifiers, in our case the 20 decision trees.
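
If the sketch above is continued, a curve of this kind can be reproduced with staged predictions, which return the ensemble's decision after each of the 20 trees has been added; evaluating on the validation set is an assumption about where the curve is computed.

    import numpy as np

    boosted_model.fit(X_train, y_train)

    # Misclassification rate on the validation set after each boosting step.
    errors = [np.mean(y_pred != y_valid.to_numpy())
              for y_pred in boosted_model.staged_predict(X_valid)]
    for i, err in enumerate(errors, start=1):
        print(f"after tree {i:2d}: misclassification rate {err:.3f}")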

Figure 20.16. The classification matrix of the boosting classifier

Figure 20.17. The error curve of the boosting classifier

The boosting classifier obtained is compared with a reference polynomial kernel SVM fitted on the whole training dataset. The statistical and graphical results are shown below.
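
Continuing the same sketch, the reference model can be fitted and the two misclassification rates compared as follows; standardizing the inputs and the polynomial degree of 3 are conventional choices rather than settings taken from the experiment.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Polynomial kernel SVM as the reference classifier.
    svm_model = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3))
    svm_model.fit(X_train, y_train)

    for name, model in [("boosting", boosted_model), ("poly SVM", svm_model)]:
        rate = np.mean(model.predict(X_test) != y_test.to_numpy())
        print(f"{name}: misclassification rate {rate:.3f}")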

Figure 20.18. Misclassification rates of the boosting classifier and the SVM

Figure 20.19. Classification matrices for the boosting classifier and the SVM

Figure 20.20. Cumulative response curves of the boosting classifier and the SVM

Figure 20.21. Response curves of the boosting classifier and the SVM comparing the baseline and the optimal classifiers

Figure 20.22. ROC curves of the boosting classifier and the SVM

Interpretation of the results

The experiment shows that a classifier obtained by the boosting method is competitive even with a polynomial kernel support vector machine classifier: although its misclassification rate is worse, it achieves higher accuracy in the first few deciles. This can be seen clearly in the response and ROC curves.
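
As an illustration of this comparison, the ROC curves of the two sketched models can be drawn as below; both classifiers expose a decision_function whose scores rank the cases, which is also what the response curves are based on.

    import matplotlib.pyplot as plt
    from sklearn.metrics import auc, roc_curve

    for name, model in [("boosting", boosted_model), ("poly SVM", svm_model)]:
        fpr, tpr, _ = roc_curve(y_test, model.decision_function(X_test))
        plt.plot(fpr, tpr, label=f"{name} (AUC = {auc(fpr, tpr):.3f})")
    plt.plot([0, 1], [0, 1], "k--", label="baseline")
    plt.xlabel("false positive rate")
    plt.ylabel("true positive rate")
    plt.legend()
    plt.show()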

Workflow

sas_ensemble_exp3.xml

Keywords

ensemble method
supervised learning
boosting
ROC curve
classification

Operators

Data Source
Decision Tree
End Groups
Model Comparison
Data Partition
Start Groups
Support Vector Machine