This process uses the Wine dataset to show how a regression model can be fitted to a given dataset. Classification can also be based on a regression model, but the process shows that building the regression model alone is not sufficient for this. The regression model yields approximate values for the numerical labels, but these values are not assigned to concrete class labels. In addition, as with other classification methods, the dataset has to be split into training and test sets, and the regression model built on the training set is then applied to the test set.
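The split into training and test sets can be sketched in a few lines of Python. This is only an illustration: the function name `train_test_split`, the 70/30 ratio, and the fixed seed are assumptions, since the text does not state how the process partitions the data, and the records are hypothetical placeholders rather than the real Wine attributes.

```python
import random


def train_test_split(records, test_fraction=0.3, seed=42):
    """Shuffle a copy of `records` and return (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed: reproducible split
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical placeholders for the 178 Wine records: (record_id, class_label).
records = [(i, 1 + i % 3) for i in range(178)]
train, test = train_test_split(records)
print(len(train), len(test))
```

The model is then fitted on `train` only, and `test` is held back for applying the model.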
Wine [UCI MLR]
When creating the regression model, one can choose among various types of regression, such as linear regression or logistic regression. Of these, linear regression is used in this process. Its parameters allow one to define, for example, which method should be used for attribute selection and what the minimum tolerance should be. The resulting linear regression model can then be applied to the test set.
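To make the idea of fitting and applying a linear regression model concrete, here is a minimal sketch using ordinary least squares on a single attribute. It is not the process's own implementation: it has no attribute selection and no tolerance setting, the function names are hypothetical, and the toy numbers are invented for illustration (the real Wine data has 13 attributes).

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for y ~ intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) model to one attribute value."""
    intercept, slope = model
    return intercept + slope * x


# Invented toy training data: one attribute value and a numeric label each.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 1.0, 2.0, 3.0]
model = fit_simple_linear_regression(train_x, train_y)

# Apply the fitted model to (equally invented) test attribute values.
test_x = [1.5, 3.5]
estimates = [predict(model, x) for x in test_x]
print(model, estimates)
```

The estimates are real numbers, not class labels, which is exactly the limitation the process points out.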
The following regression model is created based on the data of the training set:
Applying the regression model built from the training records to the test set yields an approximate value for the label of each test record. These approximate values can be seen in the labelled data set produced by applying the model:
It can be seen that most of the approximate values are rather good estimates that lie close to the original label, but this by itself is insufficient to complete the classification task. To classify records based on a regression model, its estimates have to be assigned to class labels.
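One simple way to perform this assignment is to map each estimate to the nearest class label. The sketch below assumes the three Wine classes are coded as 1, 2 and 3; the function name `nearest_class` and the sample estimates and labels are hypothetical and are not taken from the process output.

```python
def nearest_class(estimate, class_labels=(1, 2, 3)):
    """Return the class label closest to a numeric estimate.

    On an exact tie, the label listed first in `class_labels` is chosen.
    """
    return min(class_labels, key=lambda label: abs(label - estimate))


# Invented regression estimates and true labels for five test records.
estimates = [1.08, 1.93, 2.61, 0.72, 3.40]
true_labels = [1, 2, 2, 1, 3]

predicted = [nearest_class(e) for e in estimates]
correct = sum(p == t for p, t in zip(predicted, true_labels))
accuracy = correct / len(true_labels)
print(predicted, accuracy)
```

Estimates outside the label range, such as 0.72 or 3.40, are pulled to the nearest valid class, so every record receives a concrete label that can be compared with the original one.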