Evaluation of performance for classification by regression model

Description

Using the Spambase dataset, this process shows how to evaluate the quality, or precision, of a classification produced by a regression model fitted to a given data set. Once the regression model has been built on the training set and used to classify the test set, the quality of the resulting classification can be examined. This evaluation helps decide whether the classification is adequate for the goals of the process, whether the existing model should be improved further, or whether the model is of such poor quality that a completely new model is necessary.
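The division into a training set and a test set that the process relies on can be sketched in Python as follows. This is only an illustration: the function name `split_data`, the 70/30 ratio, and the fixed seed are assumptions and are not taken from the workflow itself.

```python
import random

def split_data(records, train_ratio=0.7, seed=42):
    """Shuffle the records and cut them into a training and a test part.

    train_ratio and seed are illustrative assumptions, not workflow settings.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Toy stand-in for the data set: ten record identifiers.
records = list(range(10))
train, test = split_data(records)
```

Every record ends up in exactly one of the two parts, so the test set contains only records the model has not seen during training.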

Input

Spambase [UCI MLR]

Output

Before the regression model can be used for classification, the regression operator has to be placed inside an operator that implements regression-based classification. As when the regression operator is used on its own, its parameters can be set, for example the method used for attribute selection or the level of minimal tolerance. The linear regression model created this way can then be applied to the test set.
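The idea of regression-based classification can be sketched in plain Python. The sketch assumes a one-model-per-class scheme: for each class, a linear regression is fitted to a target that is +1 for records of that class and -1 otherwise, and a record is assigned to the class whose model predicts the largest value. It is an illustration under these assumptions, not the operator's actual implementation; all function names, the toy data, and the tiny ridge term added for numerical stability are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear(rows, targets, ridge=1e-8):
    """Least-squares weights (intercept first) from the normal equations.

    The small ridge term is an assumption added only to keep the system well conditioned.
    """
    xs = [[1.0] + list(r) for r in rows]
    d = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(d)] for i in range(d)]
    for i in range(d):
        xtx[i][i] += ridge
    xty = [sum(x[i] * t for x, t in zip(xs, targets)) for i in range(d)]
    return solve(xtx, xty)

def fit_classifier(rows, labels):
    """Fit one regression model per class on a +1 / -1 indicator target."""
    models = {}
    for cls in sorted(set(labels)):
        targets = [1.0 if lab == cls else -1.0 for lab in labels]
        models[cls] = fit_linear(rows, targets)
    return models

def predict(models, row):
    """Return the class whose regression model gives the largest value."""
    x = [1.0] + list(row)
    scores = {cls: sum(w * v for w, v in zip(ws, x)) for cls, ws in models.items()}
    return max(scores, key=scores.get)

# Toy data with two numeric attributes (not taken from Spambase).
train_rows = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
train_labels = ["ham", "ham", "ham", "spam", "spam", "spam"]
models = fit_classifier(train_rows, train_labels)
predictions = [predict(models, r) for r in [[0.05, 0.1], [0.95, 0.9]]]
```

With two classes the two models are mirror images of each other, so a single regression would suffice; the per-class loop is kept because it also covers more than two classes.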

Figure 7.7. The subprocess of the classification by regression operator

The following regression model is created based on the data of the training set:

Figure 7.8. The linear regression model yielded as a result

Interpretation of the results

Applying the regression model built from the records of the training set to the test set yields, for each test record, confidence values that express how likely the record is to belong to each class. Based on these confidence values, each record of the test set is assigned a class. It can then be evaluated how many records have been classified correctly by the regression model:

Figure 7.9. The performance vector of the classification based on the regression model
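How confidence values turn into class assignments, and how a performance vector can be derived from them, can be sketched as follows. The confidence values and labels below are made up for illustration and are not results of the workflow; the function names are hypothetical as well.

```python
from collections import Counter

def assign_classes(confidences):
    """For each record, pick the class with the highest confidence."""
    return [max(conf, key=conf.get) for conf in confidences]

def confusion_counts(actual, predicted):
    """Count (actual class, predicted class) pairs."""
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Share of records whose predicted class equals the actual class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def class_recall(counts, cls):
    """Correctly predicted records of cls divided by all records that are cls."""
    total = sum(n for (a, _), n in counts.items() if a == cls)
    return counts[(cls, cls)] / total if total else 0.0

def class_precision(counts, cls):
    """Correctly predicted records of cls divided by all records predicted as cls."""
    total = sum(n for (_, p), n in counts.items() if p == cls)
    return counts[(cls, cls)] / total if total else 0.0

# Made-up confidences for five test records (illustrative only).
confidences = [
    {"spam": 0.9, "ham": 0.1},
    {"spam": 0.4, "ham": 0.6},
    {"spam": 0.7, "ham": 0.3},
    {"spam": 0.2, "ham": 0.8},
    {"spam": 0.55, "ham": 0.45},
]
actual = ["spam", "ham", "ham", "ham", "spam"]

predicted = assign_classes(confidences)
counts = confusion_counts(actual, predicted)
acc = accuracy(actual, predicted)
```

In this toy example four of the five records are classified correctly; the one error is a ham record labelled as spam, which lowers the precision of the spam class and the recall of the ham class.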

Workflow

regr_exp3.rmp

Keywords

classification
regression
performance
evaluation

Operators

Apply Model
Classification by Regression
Linear Regression
Performance (Classification)
Read AML
Split Data