The experiment demonstrates the combined method of boosting, in which a better-fitting model is built by combining supervised data mining models. The method is based on repeatedly reweighting the records and the classifiers so that the wrongly classified cases gain more and more importance and the subsequent classifiers try to assign them to the right class. In the boosting method a basic classifier is selected, which can be a decision tree, a logistic regression, a neural network, etc., and several copies of it, as many as the boosting cycle specifies, are built. In this experiment the basic classifier is a decision tree.
In the experiment the boosting cycle is set to 20, that is, 20 decision trees are fitted on the whole training dataset. The result is compared to a polynomial kernel support vector machine (SVM), which is recognized as an effective method for binary classification tasks.
In the boosting method the basic classifier is determined by the operator placed between the Start Groups and End Groups operators. The size of the boosting cycle is set in the Start Groups operator.
The experiment uses the Spambase dataset [UCI MLR]. In the preprocessing step the dataset is partitioned by the Data Partition operator into training, validation, and test datasets at rates of 60/20/20.
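A 60/20/20 partition of the kind performed by the Data Partition operator can be sketched as follows; the function name and the NumPy-based shuffling are our own illustration, not the tool's implementation.

```python
import numpy as np

def partition(X, y, rates=(0.6, 0.2, 0.2), seed=0):
    """Randomly split (X, y) into training, validation, and test sets."""
    rng = np.random.default_rng(seed)       # reproducible shuffle
    idx = rng.permutation(len(y))
    n_train = int(rates[0] * len(y))
    n_valid = int(rates[1] * len(y))
    train = idx[:n_train]
    valid = idx[n_train:n_train + n_valid]
    test = idx[n_train + n_valid:]          # remainder goes to the test set
    return (X[train], y[train]), (X[valid], y[valid]), (X[test], y[test])
```

The training set is used to fit the classifiers, the validation set to tune them, and the test set for the final unbiased comparison.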
The tools available to evaluate boosting classifiers are similar to those available for other supervised data mining models: statistics (number of incorrectly classified cases, misclassification rate) and graphs (response and lift curves). The only additional graph is the second figure, which shows the error of each of the resulting classifiers, the 20 decision trees in our case.
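The statistics and graphs mentioned above are easy to reproduce by hand. The sketch below, with our own helper names, computes the misclassification rate and the per-decile response values that underlie a response curve: records are sorted by the classifier's score and the proportion of positives is taken in each decile.

```python
import numpy as np

def misclassification_rate(y_true, y_pred):
    """Fraction of incorrectly classified cases."""
    return np.mean(y_true != y_pred)

def response_by_decile(y_true, scores, n_bins=10):
    """Proportion of positive cases in each decile when records
    are ranked by descending classifier score."""
    order = np.argsort(-scores)
    bins = np.array_split(y_true[order] == 1, n_bins)
    return np.array([b.mean() for b in bins])
```

A good classifier concentrates the positives in the first few deciles, which is exactly what the response and lift curves visualize.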
The obtained boosting classifier is compared with a reference polynomial kernel SVM fitted on the whole training dataset. The statistical and graphical results are shown below.
Figure 20.21. Response curves of the boosting classifier and the SVM comparing the baseline and the optimal classifiers
The experiment shows that a classifier obtained by the boosting method is competitive even with a polynomial kernel support vector machine classifier: although its misclassification rate is worse, it has higher accuracy in the first few deciles. This can be seen clearly in the response and ROC curves.
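For intuition on why a polynomial kernel makes the reference SVM strong: it lets a linear method separate classes that are not linearly separable in the original feature space. As a lightweight stand-in for a full SVM (which additionally maximizes the margin), the sketch below trains a kernel perceptron with a degree-2 polynomial kernel on the XOR pattern, a classic example no linear classifier can solve; all names are our own.

```python
import numpy as np

def poly_kernel(X1, X2, degree=2, coef0=1.0):
    """Polynomial kernel (x . z + coef0)^degree."""
    return (X1 @ X2.T + coef0) ** degree

def kernel_perceptron(X, y, degree=2, epochs=200):
    """Dual perceptron: alpha[i] counts the mistakes made on example i."""
    K = poly_kernel(X, X, degree)
    alpha = np.zeros(len(y))
    for _ in range(epochs):
        for i in range(len(y)):
            # np.sign(0) == 0, so an exactly-zero score also counts as a mistake
            if np.sign((alpha * y) @ K[:, i]) != y[i]:
                alpha[i] += 1
    return alpha

def kp_predict(alpha, y_train, X_train, X_new, degree=2):
    """Classify new points with the kernelized decision function."""
    K = poly_kernel(X_train, X_new, degree)
    return np.where((alpha * y_train) @ K >= 0, 1, -1)
```

The degree-2 kernel implicitly adds the cross-term x1*x2 to the feature space, which is what makes the XOR labels separable; an SVM with the same kernel works in the same implicit space but picks the maximum-margin separator.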