The process shows, using the
dataset, that how can we fit a regression model to a dataset containing discrete but non-binary target variable.
Moreover, how can we performe a classification task using the parameter estimates obtained from the model. The
type of the fitted regression model depends on the measurement scale of the discrete target variable. If the
target is nominal then the
Regression operator fits binary logistic models separately, so
that a selected event of the target is compared to the class of the other values of the discrete target variable.
On the other hand, if the target is ordinal then a common logistic regression model are fitted, wherein only the
constant parameters differ, but the parameters of the input variables are shared.
(As opposed to the nominal case where these coefficients are different.)
Wine [UCI MLR]
A number of models can be choosen to fit a regression model, e.g., linear or logistic regression. Among them,
the logistic regression used in the process. It is not needed to set in, the software recognizes the right
type of the regression using the metadata on the target. Of course, it is possible to override this option and
to enforce linear regression, but this does not make sense because in this case the
operator can not be used to compare this model to other discrete supervised models. The fit of the model can be
tested by the well-known statistics and charts.
The classification chart shows that the fitted model is perfect on the training dataset and has small error on the validation dataset.
Besides the standard goodness-of-fit tests the
Regression operator presents a bar graph
showing the importance of the input variables in the regression models. The higher the coefficient of an input
variable, the more is its explanation power with respect to the target variable. In case of ordinal target, there
is only one, while, in case of the nominal target, the number of different values of the target variable
minus 1 bar graphs are created. Since the
Class variable has three values there are two
such bar graphs below.
Using the regression model created on the training set for the validation dataset the above results show that
a model can be built with relatively high accuracy for multiple discrete target variable value by the
Regression operator. It is noted that for the problem discussed here the
Dmine Regression operator can not be applied.