Classification by linear regression

Description

The process uses the Wine dataset to show how a regression model can be fitted to a dataset, and how a classification task can then be completed based on the resulting estimates. Classification can also be performed with a regression model: the model yields approximate numerical values for the labels, and these values are then mapped to concrete class labels. As with other classification methods, the dataset has to be split into a training set and a test set, and the regression model built on the training set is applied to the test set.
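
The splitting step can be sketched in a few lines of Python. This is only an illustration of a shuffled split, not the Split Data operator itself: the function name `split_indices`, the 70/30 ratio, and the seed are assumptions chosen for the example. The record count of 178 is the size of the UCI Wine dataset.

```python
import random


def split_indices(n_records, train_ratio=0.7, seed=42):
    """Shuffle the record indices and cut them into a training and a test part.

    The ratio and the seed are illustrative assumptions, not values
    taken from the workflow.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    cut = int(round(n_records * train_ratio))
    return indices[:cut], indices[cut:]


# The UCI Wine dataset contains 178 records.
train_idx, test_idx = split_indices(178)
print(len(train_idx), len(test_idx))
```

Every record ends up in exactly one of the two sets, so the model is never evaluated on records it was fitted to.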

Input

Wine [UCI MLR]

Output

Several types of regression can be chosen when creating the regression model, such as linear regression or logistic regression. Linear regression is used in this process. To use it for classification, it has to be placed inside an operator that implements regression-based classification. Just as when the operator is used on its own, its parameters can be set, for example the method used for attribute selection or the minimal tolerance. The resulting linear regression model can then be applied to the test set.
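
A common scheme for regression-based classification is to train one regression model per class on an indicator target (+1 for records of that class, -1 for all others) and to predict the class whose model returns the largest value. The sketch below assumes this scheme. The pure-Python least-squares solver, the function names, and the two-attribute toy data are illustrative assumptions; they are not the operator's implementation and not the Wine data.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations; returns [intercept, w1, ...]."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(k)]
    return solve(xtx, xty)


def fit_classifier(rows, labels):
    """Fit one linear regression per class with +1 / -1 indicator targets."""
    return {c: fit_linear(rows, [1.0 if lab == c else -1.0 for lab in labels])
            for c in sorted(set(labels))}


def predict(models, row):
    """Apply every per-class model and return the class with the largest output."""
    scores = {c: w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))
              for c, w in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: three well-separated classes, two attributes each.
train_rows = [(1.0, 1.0), (1.5, 0.5), (5.0, 1.0), (5.5, 1.5), (3.0, 5.0), (3.5, 5.5)]
train_labels = ["A", "A", "B", "B", "C", "C"]

models = fit_classifier(train_rows, train_labels)
label, scores = predict(models, (1.2, 0.8))
print(label, scores)
```

Solving the normal equations directly is acceptable for a small, well-conditioned toy problem; for real data a numerically more robust least-squares routine would be the safer choice.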

Figure 7.4. The subprocess of the classification by regression operator

The following regression model is created based on the data of the training set:

Figure 7.5. The linear regression model yielded as a result

Interpretation of the results

When the regression model built from the training records is applied to the test set, confidence values are calculated that express how likely each test record is to belong to each class. These confidence values, and the class assignments derived from them, can be seen in the labelled dataset produced by applying the model:
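
The step from confidence values to class labels can be illustrated as follows. The sketch only shows the assignment rule (pick the class with the highest confidence); the function name `assign_label` and the confidence values are hypothetical and are not taken from the result of the workflow.

```python
def assign_label(confidences):
    """Return the class whose confidence value is the highest."""
    return max(confidences, key=confidences.get)


# Hypothetical confidence values for three test records.
# The Wine dataset has the three class labels 1, 2 and 3.
test_confidences = [
    {"1": 0.81, "2": 0.15, "3": 0.04},
    {"1": 0.22, "2": 0.63, "3": 0.15},
    {"1": 0.05, "2": 0.31, "3": 0.64},
]

predicted_labels = [assign_label(c) for c in test_confidences]
print(predicted_labels)
```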

Figure 7.6. The class labels derived from the predictions calculated based on the regression model

It can be seen that the class assignments derived from the approximate values match the original labels in most cases.
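
The agreement between predicted and original labels can be quantified rather than read off the table. The following is a minimal sketch, assuming the two label columns are available as Python lists; the function names and the label values are hypothetical and do not reproduce the result of the workflow.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Share of records whose predicted label equals the original label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def confusion_counts(true_labels, predicted_labels):
    """Count every (original label, predicted label) pair."""
    return Counter(zip(true_labels, predicted_labels))


# Hypothetical labels for six test records; one record is misclassified.
original = ["1", "1", "2", "2", "3", "3"]
predicted = ["1", "1", "2", "3", "3", "3"]

acc = accuracy(original, predicted)
confusion = confusion_counts(original, predicted)
print(acc, dict(confusion))
```

The pair counts show which classes are confused with each other, which a single accuracy figure hides.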

Workflow

regr_exp2.rmp

Keywords

classification
regression

Operators

Apply Model
Classification by Regression
Linear Regression
Read AML
Split Data