Chapter 7. Classification Methods 3

Regression

Table of Contents

Linear regression
Classification by linear regression
Evaluation of performance for classification by regression model
Evaluation of performance for classification by regression model 2

Linear regression

Description

Using the Wine dataset, this process shows how a regression model can be fitted to a given data set. Classification can also be based on a regression model, but the process shows that creating the regression model alone is not sufficient for this. The regression model yields approximate values for the numerical labels, but these values are not assigned to concrete class labels. As with other classification methods, the data set has to be split into a training set and a test set, and the regression model built on the training set is then applied to the test set.
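The split, fit, and apply pattern described above can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the RapidMiner process itself: the data is synthetic (one attribute with numeric labels 1, 2, and 3 instead of the Wine attributes), and the helper names split_data, fit_line, and apply_model are hypothetical.

```python
import random

def split_data(rows, train_ratio=0.7, seed=42):
    """Shuffle the rows and split them into a training set and a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def fit_line(train):
    """Ordinary least squares for one attribute: label ~ slope * x + intercept."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def apply_model(model, rows):
    """Compute an approximate numeric label for every row."""
    slope, intercept = model
    return [slope * x + intercept for x, _ in rows]

# Synthetic data (an assumption for illustration): (attribute value, numeric label)
rows = (
    [(1.0 + 0.1 * i, 1) for i in range(10)]
    + [(3.0 + 0.1 * i, 2) for i in range(10)]
    + [(5.0 + 0.1 * i, 3) for i in range(10)]
)

train, test = split_data(rows)          # model is built on the training set only
model = fit_line(train)
predictions = apply_model(model, test)  # and then applied to the test set

for (x, label), pred in zip(test, predictions):
    print(f"x={x:.1f}  label={label}  prediction={pred:.2f}")
```

The printed predictions are real numbers near the labels, not the labels themselves, which is the limitation the process demonstrates.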

Input

Wine [UCI MLR]

Output

Several types of regression are available when creating the regression model, such as linear regression and logistic regression. This process uses linear regression. The properties of the operator define, for example, which method is used for attribute selection and what the minimal tolerance should be. The resulting linear regression model can then be applied to the test set.

Figure 7.1. Properties of the linear regression operator

The following regression model is created based on the data of the training set:

Figure 7.2. The linear regression model yielded as a result

Interpretation of the results

Applying the regression model built from the training records to the test set gives an approximate value for the label of each test record. These approximate values appear in the labelled data set produced by applying the model:

Figure 7.3. The class prediction values calculated based on the linear regression model

Most of the approximate values are good estimates, close to the original label, but this alone is not sufficient to complete the classification task. To classify records with a regression model, its estimates have to be assigned to class labels.
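One simple way to make this assignment is to map each estimate to the nearest class label. The sketch below assumes numeric class labels 1, 2, and 3; the function name to_class_label and the sample estimates are hypothetical, and this is not necessarily the assignment used in the workflow.

```python
def to_class_label(estimate, labels=(1, 2, 3)):
    """Return the class label whose numeric value is closest to the estimate."""
    return min(labels, key=lambda label: abs(label - estimate))

# Hypothetical regression outputs for five test records
estimates = [0.87, 1.42, 2.08, 2.61, 3.3]
predicted = [to_class_label(e) for e in estimates]
print(predicted)  # prints [1, 1, 2, 3, 3]
```

Estimates outside the label range, such as 0.87 or 3.3, are still mapped to the closest valid label, so every record receives a concrete class.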

Workflow

regr_exp1.rmp

Keywords

classification
regression

Operators

Apply Model
Linear Regression
Read AML
Split Data