Chapter 24. Regression for continuous target

Table of Contents

Logistic regression
Prediction of discrete target by regression models
Supervised models for continuous target

Logistic regression

Description

Using the Spambase dataset, this process shows how a regression model can be fitted to a dataset with a binary target. Conventional linear regression is not suitable for this task, even though the Regression operator offers this option. Instead, we must use logistic regression, which is the default option of this operator. We can choose among the following link functions: logit, which gives the procedure its name, probit, and complementary log-log. In practice, the choice among these link functions rarely makes a substantial difference to the results. Enterprise Miner™ provides another operator for fitting regressions: the Dmine Regression operator, which fits a forward stepwise regression. In each step, the input variable that contributes most to explaining the variability of the target is selected.
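To make the role of the link function concrete, the following minimal Python sketch evaluates the inverse of each of the three links named above (the third in its usual complementary log-log form). It is only an illustration outside Enterprise Miner: the function names and the helper predicted_probability are hypothetical, and no coefficients from the Spambase experiment are used.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Inverse of the probit link: the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_cloglog(eta):
    """Inverse of the complementary log-log link: p = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


# Hypothetical lookup table, used only by this sketch.
INVERSE_LINKS = {"logit": inv_logit, "probit": inv_probit, "cloglog": inv_cloglog}


def predicted_probability(coefficients, intercept, inputs, link="logit"):
    """Map the linear predictor of one observation to a probability."""
    eta = intercept + sum(b * x for b, x in zip(coefficients, inputs))
    return INVERSE_LINKS[link](eta)


# Compare the three links on a few values of the linear predictor.
for eta in (-2.0, 0.0, 2.0):
    print(eta, {name: round(f(eta), 4) for name, f in INVERSE_LINKS.items()})
```

All three functions map any real-valued linear predictor into the interval (0, 1), which is what makes them usable for a binary target. The logit and probit curves are symmetric around 0.5, while the complementary log-log curve is not.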

Input

Spambase [UCI MLR]

Output

After fitting the logistic regression, the standard statistics and graphs are obtained, as in other binary classification tasks. Only the confusion matrix is shown here; the remaining comparison tools are presented at the end of this experiment.
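As a reminder of how a confusion matrix is derived from the predicted probabilities of a binary model, here is a small Python sketch. The cutoff of 0.5, the function names and the toy data are assumptions made for illustration; they are not taken from the Spambase results.

```python
def confusion_matrix(actual, probabilities, cutoff=0.5):
    """Count the four outcome types for a binary target coded as 0/1."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for y, p in zip(actual, probabilities):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1:
            counts["TP" if y == 1 else "FP"] += 1
        else:
            counts["FN" if y == 1 else "TN"] += 1
    return counts


def misclassification_rate(counts):
    """Share of observations assigned to the wrong class."""
    return (counts["FP"] + counts["FN"]) / sum(counts.values())


# Invented toy data: true classes and predicted probabilities of class 1.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
probabilities = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3]

counts = confusion_matrix(actual, probabilities)
print(counts)                          # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
print(misclassification_rate(counts))  # 0.25
```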

Figure 24.1. Classification matrix of the logistic regression

In addition to the usual tools, the regression operators also show, by means of the effects plot, the importance of the input variables in the regression model built during the process.

Figure 24.2. Effects plot of the logistic regression

In addition to traditional regression analysis, Enterprise Miner™ provides another operator, Dmine Regression, for fitting a forward stepwise regression. Its results can be seen in the figures below.
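The idea of forward stepwise selection can be sketched in a few lines of Python. The sketch below is not the Dmine Regression implementation: it assumes a least-squares fit and uses the gain in R-squared as the selection criterion, and the function name forward_stepwise, the stopping threshold min_gain and the toy data are all hypothetical.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _center(values):
    mean = sum(values) / len(values)
    return [a - mean for a in values]


def forward_stepwise(columns, target, min_gain=0.01, tol=1e-12):
    """Greedy forward selection for least squares (assumed criterion: R^2 gain).

    columns maps an input name to its list of values.
    Returns a list of (name, cumulative R^2) in order of selection.
    """
    residual = _center(target)
    total_ss = _dot(residual, residual)
    if total_ss == 0.0:
        return []  # a constant target leaves nothing to explain
    remaining = {name: _center(values) for name, values in columns.items()}
    selected, explained = [], 0.0
    while remaining:
        # Find the input whose inclusion removes the most residual variation.
        best_name, best_gain = None, 0.0
        for name, z in remaining.items():
            zz = _dot(z, z)
            if zz <= tol:
                continue  # input is (numerically) redundant
            gain = _dot(z, residual) ** 2 / zz
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None or best_gain / total_ss < min_gain:
            break
        z = remaining.pop(best_name)
        zz = _dot(z, z)
        beta = _dot(z, residual) / zz
        residual = [r - beta * a for r, a in zip(residual, z)]
        # Remove the chosen direction from the inputs that are still candidates.
        for name in list(remaining):
            w = remaining[name]
            proj = _dot(w, z) / zz
            remaining[name] = [b - proj * a for b, a in zip(w, z)]
        explained += best_gain
        selected.append((best_name, explained / total_ss))
    return selected


# Invented toy data: the target depends on x1 and x2 only.
columns = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [1, 0, 1, 0, 1, 0],
    "x3": [2, 7, 1, 8, 2, 8],
}
target = [2 * a + b for a, b in zip(columns["x1"], columns["x2"])]
for name, r_squared in forward_stepwise(columns, target):
    print(name, round(r_squared, 4))
```

On this toy data the procedure picks x1 first, then x2, and stops because x3 adds nothing further. Orthogonalizing the remaining inputs against each selected one ensures that every later gain measures only the variation not yet explained.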

Figure 24.3. Classification matrix of the stepwise logistic regression

Figure 24.4. Effects plot of the stepwise logistic regression

The two regressions can be compared in the usual way with the Model Comparison operator. The results of this comparison are presented in the following figures.
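Two of the quantities behind such a comparison, the area under the ROC curve and the cumulative lift, can be computed by hand as in the Python sketch below. The scores of the two models are invented toy values, and the function names are hypothetical; the sketch does not reproduce the Model Comparison operator or its output.

```python
import math


def roc_auc(actual, scores):
    """Area under the ROC curve as the share of correctly ordered
    (positive, negative) pairs; ties count as one half."""
    positives = [s for y, s in zip(actual, scores) if y == 1]
    negatives = [s for y, s in zip(actual, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def cumulative_lift(actual, scores, depth):
    """Positive rate among the top `depth` share of scores,
    divided by the overall positive rate."""
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, math.ceil(depth * len(ranked)))
    top_rate = sum(y for _, y in ranked[:top_n]) / top_n
    base_rate = sum(actual) / len(actual)
    return top_rate / base_rate


# Invented toy data: true classes and the scores of two hypothetical models.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3]
model_b = [0.6, 0.5, 0.3, 0.7, 0.8, 0.2, 0.4, 0.1]

for name, scores in (("model_a", model_a), ("model_b", model_b)):
    print(name, roc_auc(actual, scores), cumulative_lift(actual, scores, 0.25))
# model_a 0.9375 2.0
# model_b 0.625 1.0
```

A model with a larger area under the ROC curve and a higher lift at the same depth ranks the positive cases better, which is the sense in which one model is judged superior in the comparison figures.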

Figure 24.5. Fitting statistics for logistic regression models

Figure 24.6. Classification charts of the logistic regression models

Figure 24.7. Cumulative lift curve of the logistic regression models

Figure 24.8. ROC curves of the logistic regression models

Interpretation of the results

On the test set, the fit statistics and ROC curves clearly show that the logistic regression model performs better than the stepwise logistic regression model.

Workflow

sas_regr_exp1.xml

Keywords

classification
binary target
logistic regression

Operators

Data Source
Dmine Regression
Model Comparison
Data Partition
Regression