The experiment introduces the use of ensemble methods, featuring the
Bagging operator. The average classification error
rate from 10-fold cross-validation on the Heart Disease
data set is compared for a single decision stump and an ensemble of 10
decision stumps trained by bagging. The impurity measure used for the
decision stumps is the gain ratio.
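The comparison described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the RapidMiner process itself: it uses a small synthetic data set rather than Heart Disease, handles only numeric attributes with binary threshold splits, assigns folds by index instead of by shuffled sampling, and all function names (fit_stump, fit_bagging, cv_error and so on) are hypothetical.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(left, right):
    """Information gain of a binary split divided by its split information."""
    n = len(left) + len(right)
    wl, wr = len(left) / n, len(right) / n
    gain = entropy(left + right) - wl * entropy(left) - wr * entropy(right)
    split_info = -(wl * math.log2(wl) + wr * math.log2(wr))
    return gain / split_info

def majority(labels):
    """Most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y):
    """Return (feature, threshold, left_label, right_label) with the best gain ratio."""
    best, best_score = None, -1.0
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between neighbouring values
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            score = gain_ratio(left, right)
            if score > best_score:
                best_score, best = score, (j, t, majority(left), majority(right))
    if best is None:  # every attribute is constant: always predict the majority class
        m = majority(y)
        return (0, float("inf"), m, m)
    return best

def predict_stump(stump, row):
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label

def fit_bagging(X, y, n_estimators, rng):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def predict_bagging(stumps, row):
    """Combine the base classifiers by majority vote."""
    return majority([predict_stump(s, row) for s in stumps])

def cv_error(X, y, fit, predict, k=10):
    """Average error rate over k folds; fold f holds the examples with index % k == f."""
    errors = []
    for fold in range(k):
        test = [i for i in range(len(X)) if i % k == fold]
        train = [i for i in range(len(X)) if i % k != fold]
        model = fit([X[i] for i in train], [y[i] for i in train])
        wrong = sum(predict(model, X[i]) != y[i] for i in test)
        errors.append(wrong / len(test))
    return sum(errors) / k

# Synthetic two-attribute data (an assumption; NOT the Heart Disease data set).
rng = random.Random(0)
X = [[rng.randint(0, 20), rng.randint(0, 20)] for _ in range(200)]
y = [int(a + b > 20) for a, b in X]

single_error = cv_error(X, y, fit_stump, predict_stump)
bagged_error = cv_error(
    X, y, lambda Xt, yt: fit_bagging(Xt, yt, 10, rng), predict_bagging
)
print(f"single stump: {single_error:.3f}  bagged (10 stumps): {bagged_error:.3f}")
```

Because the data here are synthetic, the printed error rates say nothing about the Heart Disease results reported in the figures; the sketch only shows how the two evaluations are set up side by side.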
Heart Disease [UCI MLR]
The data set was donated to the UCI Machine Learning Repository by R. Detrano [Detrano et al.].
Figure 9.1. The average classification error rate of a single decision stump obtained from 10-fold cross-validation.
Figure 9.2. The average classification error rate of the bagging algorithm obtained from 10-fold cross-validation, where 10 decision stumps were used as base classifiers.
An ensemble of 10 decision stumps trained by bagging achieves an average classification error rate that is about 7% better than that of a single decision stump.