In this experiment, we use the Congressional Voting Records dataset to demonstrate how to modify the values of attributes with the Replacement operator and then how to impute the missing values.
The missing values of each variable can be replaced independently of the other variables, or the replacement can be made to take the target variable into account by fitting a decision tree.
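The tree-based idea can be illustrated with a minimal sketch. It assumes a depth-one tree (a decision stump) that predicts the incomplete attribute from the single most informative other attribute, with the target variable among the candidates. The rows, the attribute names `party`, `vote_a` and `vote_b`, and the helpers `fit_stump` and `impute_with_stump` are hypothetical and made up for illustration; the tree actually fitted by the operator may be deeper and use a different splitting criterion.

```python
from collections import Counter, defaultdict

def fit_stump(rows, incomplete, candidates):
    """Fit a depth-one tree predicting `incomplete` from the best single candidate."""
    known = [r for r in rows if r[incomplete] is not None]
    if not known or not candidates:
        raise ValueError("need observed values and at least one candidate attribute")
    # Fallback value for branches never seen among the complete rows.
    overall = Counter(r[incomplete] for r in known).most_common(1)[0][0]
    best = None  # (correctly predicted rows, split attribute, branch -> majority value)
    for attr in candidates:
        groups = defaultdict(Counter)
        for r in known:
            if r[attr] is not None:
                groups[r[attr]][r[incomplete]] += 1
        leaves = {value: counts.most_common(1)[0][0] for value, counts in groups.items()}
        correct = sum(counts[leaves[value]] for value, counts in groups.items())
        if best is None or correct > best[0]:
            best = (correct, attr, leaves)
    return best[1], best[2], overall

def impute_with_stump(rows, incomplete, candidates):
    """Return the chosen split attribute and a copy of the rows with gaps filled."""
    split_attr, leaves, overall = fit_stump(rows, incomplete, candidates)
    imputed = []
    for r in rows:
        r = dict(r)  # copy, so the input rows stay untouched
        if r[incomplete] is None:
            r[incomplete] = leaves.get(r[split_attr], overall)
        imputed.append(r)
    return split_attr, imputed

# Made-up toy rows; None marks a missing vote.
rows = [
    {"party": "democrat", "vote_a": "y", "vote_b": "y"},
    {"party": "democrat", "vote_a": "n", "vote_b": "y"},
    {"party": "democrat", "vote_a": "y", "vote_b": None},
    {"party": "republican", "vote_a": "y", "vote_b": "n"},
    {"party": "republican", "vote_a": "n", "vote_b": "n"},
    {"party": "republican", "vote_a": "n", "vote_b": None},
]
split_attr, imputed = impute_with_stump(rows, "vote_b", ["party", "vote_a"])
print(split_attr)
print([r["vote_b"] for r in imputed])
```

In this toy data the target variable separates the observed votes better than the other vote does, so the stump splits on it and each missing vote receives the majority value of its own class.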
Congressional Voting Records [UCI MLR]
In the Replacement operator, we can set the substitution of values for discrete and continuous variables.
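As a rough illustration of what such a substitution does, the sketch below maps selected discrete values to new ones and limits continuous values to a range. The helper names `replace_discrete` and `replace_continuous`, the `ages` column, and the limits are hypothetical; the voting data itself contains only discrete attributes, where a missing vote is coded as `?`. Whether the operator clamps out-of-range values or sets them to missing is an assumption here.

```python
def replace_discrete(values, mapping):
    """Replace each discrete value found in `mapping`; keep all other values."""
    return [mapping.get(v, v) for v in values]

def replace_continuous(values, lower, upper):
    """Clamp continuous values to [lower, upper]; missing values (None) stay missing."""
    return [v if v is None else min(max(v, lower), upper) for v in values]

# Made-up toy columns; "?" is how the voting data codes an unknown vote.
votes = ["y", "n", "?", "y", "?"]
cleaned_votes = replace_discrete(votes, {"?": None})

# Hypothetical continuous column, not part of the voting data.
ages = [34.0, 51.0, None, 190.0, -3.0]
cleaned_ages = replace_continuous(ages, 18.0, 100.0)

print(cleaned_votes)
print(cleaned_ages)
```

Turning `?` into an explicit missing marker first means that the later imputation step treats these cells as gaps rather than as a third vote category.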
A number of imputation methods can be chosen. We may fill in the missing values with a constant value, with a distribution-based value, where a random value is generated by the system, or with a decision-tree-based method.
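The first two methods can be sketched in a few lines. This is a minimal illustration, not the operator's implementation: the helpers `impute_constant` and `impute_from_distribution` are hypothetical, the toy column is made up, and we assume that "distribution-based" means drawing each replacement at random from the values observed for that variable.

```python
import random
from collections import Counter

def impute_constant(values, fill):
    """Replace every missing value (None) with the same constant."""
    return [fill if v is None else v for v in values]

def impute_from_distribution(values, rng):
    """Replace every missing value with a random draw from the observed values.

    Sampling from the observed list picks each category with a probability
    equal to its observed relative frequency.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to sample from")
    return [rng.choice(observed) if v is None else v for v in values]

# Made-up toy column; None marks a missing vote.
votes = ["y", "n", None, "y", None, "y"]

by_constant = impute_constant(votes, "n")
rng = random.Random(0)  # fixed seed so that repeated runs give the same draws
by_distribution = impute_from_distribution(votes, rng)

print(by_constant)
print(Counter(by_distribution))
```

A constant shifts the distribution of the variable toward the chosen value, whereas random draws keep the observed proportions approximately intact, at the cost of adding noise to individual records.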
The results of the imputation, in relation to the target variable, are shown in the following two bar charts.
The experiment shows that if the imputation method is chosen appropriately, the values obtained in place of the missing data are not heavily distorted, and thus the model can be fitted more reliably, on a larger dataset.