Using the Wine dataset, this process shows how a regression model can be fitted to a dataset and how a classification task can then be completed from the resulting estimates. Classification can be based on a regression model: the model yields approximate numerical values for the labels, and these values are then mapped to concrete class labels. As with other classification methods, the dataset has to be split into training and test sets, and the regression model built on the training set is then applied to the test set.
Wine [UCI MLR]
Various types of regression can be chosen when creating the regression model, such as linear regression or logistic regression. This process uses linear regression. To use it for classification, the linear regression operator has to be placed inside an operator that implements regression-based classification. Just as when the operator is used on its own, its parameters can be set, for example the method used for attribute selection or the minimum tolerance. The resulting linear regression model can then be applied to the test set.
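The idea of wrapping linear regression in a regression-based classifier can be sketched in Python. This is a minimal illustration, not the operator's actual implementation: it assumes scikit-learn is available, uses its bundled copy of the UCI Wine data (`load_wine`), and assumes a one-model-per-class scheme in which each linear regression is fitted to a 0/1 indicator of class membership. The 70/30 split and the random seed are arbitrary choices, and attribute selection and tolerance settings are not reproduced.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Load the Wine data and split it into training and test sets.
# The 70/30 stratified split is an assumption made for this sketch.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

classes = np.unique(y_train)

# Fit one linear regression per class on a 0/1 indicator target:
# 1.0 if the training record belongs to the class, 0.0 otherwise.
models = []
for c in classes:
    indicator = (y_train == c).astype(float)
    models.append(LinearRegression().fit(X_train, indicator))

# Apply every model to the test set; each column holds the
# approximate numerical value produced for one class.
scores = np.column_stack([m.predict(X_test) for m in models])

# Assign each test record to the class whose model gives the highest value.
y_pred = classes[np.argmax(scores, axis=1)]

accuracy = float(np.mean(y_pred == y_test))
print(f"Test accuracy: {accuracy:.3f}")
```

Fitting one regression per class keeps the class labels nominal; regressing directly on the label codes 1, 2, 3 would impose an ordering on the classes that the data may not have.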
The following regression model is created based on the data of the training set:
Applying the regression model built from the training records to the test set yields confidence values that express how likely each test record is to belong to each class. These confidence values, and the class assignments derived from them, can be seen in the labelled dataset produced by applying the model:
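How raw regression outputs become confidence values and class assignments can be illustrated with a short sketch. The text does not state which normalization is used, so the softmax below is an assumption chosen only because it yields non-negative values that sum to one; the function names and the example scores are hypothetical.

```python
import math

def scores_to_confidences(scores):
    """Map raw per-class regression outputs to confidences that sum to 1.

    Softmax is an assumed normalization, not the documented method.
    """
    highest = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    exps = {label: math.exp(s - highest) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def predict_label(confidences):
    """Return the class label with the highest confidence."""
    return max(confidences, key=confidences.get)

# Hypothetical regression outputs for one test record.
raw_scores = {"class_1": 0.91, "class_2": 0.12, "class_3": -0.03}

confidences = scores_to_confidences(raw_scores)
label = predict_label(confidences)
print(confidences, label)
```

Whatever normalization is applied, the predicted class is the one with the highest confidence, so any monotonic rescaling of the scores leads to the same assignment.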
It can be seen that the assignments derived from the approximate values are equal to the original labels in most cases.
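The agreement between predicted and original labels can be quantified rather than inspected by eye. The sketch below computes accuracy and per-pair confusion counts; the label lists are a hypothetical toy example, not results from the Wine process.

```python
from collections import Counter

def accuracy(true_labels, predicted_labels):
    """Fraction of records whose predicted label equals the original label."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)

def confusion_counts(true_labels, predicted_labels):
    """Count how often each (original, predicted) label pair occurs."""
    return Counter(zip(true_labels, predicted_labels))

# Hypothetical toy labels, used only to show the calculation.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 2, 3, 3, 3]

acc = accuracy(y_true, y_pred)
confusion = confusion_counts(y_true, y_pred)
print(f"accuracy = {acc:.3f}")
for (original, predicted), count in sorted(confusion.items()):
    print(f"original {original} -> predicted {predicted}: {count}")
```

The confusion counts show which classes are mistaken for one another, which a single accuracy figure cannot reveal.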