Ensemble methods: bagging

Description

This experiment demonstrates the ensemble method of bagging (bootstrap aggregation), in which a better fitting model is built from several supervised data mining models. Bagging draws several subsamples from the original training dataset by the bootstrap method, i.e., by sampling with replacement. A supervised model (a decision tree in this experiment) is fitted on each subsample, and a new model is obtained by aggregating the fitted models. In the experiment, the number of bagging cycles is set to 10, i.e., 10 decision trees are fitted on 10 different subsamples. The result is compared with a simple decision tree fitted to the entire training dataset. In the bagging method, the base classifier is determined by the modeling operator placed between the Start Groups and End Groups operators (the Decision Tree operator here); the number of bagging cycles is set in the Start Groups operator.
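The workflow itself is built from SAS Enterprise Miner operators, but the idea can be illustrated with a minimal sketch in Python using scikit-learn (an assumption for illustration only; the original experiment does not use it): 10 decision trees are fitted on 10 bootstrap subsamples and aggregated by majority vote.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, n_cycles=10, seed=0):
    # Fit n_cycles decision trees, each on a bootstrap subsample
    # (the n training rows are sampled n times with replacement).
    rng = np.random.default_rng(seed)
    n = len(y)
    trees = []
    for _ in range(n_cycles):
        idx = rng.integers(0, n, size=n)
        trees.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))
    return trees

def bagging_predict(trees, X):
    # Aggregate the individual predictions by majority vote
    # (a 0/1 target is assumed, as in the Spambase data).
    votes = np.stack([t.predict(X) for t in trees])
    return (votes.mean(axis=0) >= 0.5).astype(int)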

Input

Spambase [UCI MLR]

In the preprocessing step the dataset is partitioned by the Data Partition operator in a 60/20/20 ratio into training, validation and test datasets.
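The same 60/20/20 partition can be sketched in Python (the file name spambase.data and the scikit-learn helper are assumptions for illustration):

import numpy as np
from sklearn.model_selection import train_test_split

data = np.loadtxt("spambase.data", delimiter=",")  # UCI Spambase; last column is the 0/1 spam label
X, y = data[:, :-1], data[:, -1].astype(int)

# 60% training, then the remaining 40% split evenly into validation and test
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.6, stratify=y, random_state=0)
X_valid, X_test, y_valid, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0)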

Output

The tools available for evaluating the bagging classifier are similar to those of other supervised data mining models: statistics (number of incorrectly classified cases, misclassification rate) and graphs (response and lift curves). The only additional graph is shown in the second figure below, where the errors of the 10 classifiers obtained in the consecutive bagging cycles are plotted.
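An error curve of this kind can be reproduced in the Python sketch as the misclassification rate of each individual tree on the validation set (continuing the snippets above; matplotlib is an assumption):

import matplotlib.pyplot as plt

trees = bagging_fit(X_train, y_train, n_cycles=10)
errors = [(t.predict(X_valid) != y_valid).mean() for t in trees]
plt.plot(range(1, len(errors) + 1), errors, marker="o")
plt.xlabel("bagging cycle")
plt.ylabel("misclassification rate")
plt.show()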

Figure 20.9. The classification matrix of the bagging classifier

Figure 20.10. The error curves of the bagging classifier

The obtained bagging classifier is compared with a reference decision tree fitted on the whole training dataset. The statistical and graphical results are shown below.
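The same comparison can be sketched in Python (continuing the snippets above): the ensemble is evaluated against a single tree fitted on the whole training set, reporting the misclassification rate and the classification matrix on the test set.

from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

reference = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

for name, pred in [("bagging", bagging_predict(trees, X_test)),
                   ("decision tree", reference.predict(X_test))]:
    print(name, "misclassification rate:", (pred != y_test).mean())
    print(confusion_matrix(y_test, pred))  # rows: actual, columns: predicted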

Figure 20.11. Misclassification rates of the bagging classifier and the decision tree

Figure 20.12. Classification matrices of the bagging classifier and the decision tree

Figure 20.13. Response curves of the bagging classifier and the decision tree

Figure 20.14. Response curves of the bagging classifier and the decision tree comparing the baseline and the optimal classifiers

Figure 20.15. ROC curves of the bagging classifier and the decision tree

Interpretation of the results

The experiment shows that a better performing model can be obtained with the bagging classifier than with a simple decision tree when the models are compared on the first deciles. This is clear from the classification matrices, the response curves and the ROC curves.
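The first-decile comparison can be checked in the Python sketch by ranking the test cases by predicted spam probability and measuring the response, i.e., the spam rate among the top 10% of the ranked cases (continuing the snippets above; scoring the ensemble by the averaged tree probabilities is an assumption of the sketch):

import numpy as np

def response_at_decile(scores, y_true, depth=0.1):
    order = np.argsort(-scores)              # highest scores first
    top = order[: int(len(scores) * depth)]  # the first decile of the ranked cases
    return y_true[top].mean()                # proportion of actual spam among them

bag_scores = np.mean([t.predict_proba(X_test)[:, 1] for t in trees], axis=0)
ref_scores = reference.predict_proba(X_test)[:, 1]
print("bagging response at the 1st decile:", response_at_decile(bag_scores, y_test))
print("decision tree response at the 1st decile:", response_at_decile(ref_scores, y_test))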

Workflow

sas_ensemble_exp2.xml

Keywords

ensemble method
supervised learning
bagging
misclassification rate
ROC curve
classification

Operators

Data Source
Decision Tree
End Groups
Model Comparison
Data Partition
Start Groups