Checking Your Model

Regression analysis does not end once the regression model is fit. You should examine residual plots and other diagnostic statistics to determine whether your model is adequate and the assumptions of regression have been met. If your model is inadequate, it will not correctly represent your data. For example:

·    The standard errors of the coefficients may be biased, leading to incorrect t- and p-values.

·    Coefficients may have the wrong sign.

·    The model may be overly influenced by one or two points.

Use the table below to determine whether your model is adequate.

Characteristics of an adequate regression model, with what to check and possible solutions:

Characteristic: Functional form accurately models any curvature that is present.
Check using: Lack-of-fit tests; Residuals vs variables plot
Possible solutions:
·    Add higher-order term to model.
·    Transform variables.
·    Nonlinear regression.

Characteristic: Residuals have constant variance.
Check using: Residuals vs fits plot
Possible solutions:
·    Transform variables.
·    Weighted least squares.

Characteristic: Residuals are independent of (not correlated with) one another.
Check using: Durbin-Watson statistic; Residuals vs order plot
Possible solutions:
·    Add new predictor.
·    Use time series analysis.
·    Add lag variable.

Characteristic: Residuals are normally distributed.
Check using: Histogram of residuals; Normal plot of residuals; Residuals vs fits plot; Normality test
Possible solutions:
·    Transform variables.
·    Check for outliers.

Characteristic: No unusual observations or outliers.
Check using: Residual plots; Leverages; Cook's distance; DFITS
Possible solutions:
·    Transform variables.
·    Remove outlying observation.

Characteristic: Data are not ill-conditioned.
Check using: Variance inflation factor (VIF); Correlation matrix of predictors
Possible solutions:
·    Remove predictor.
·    Partial least squares regression.
·    Transform variables.
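
Two of the numeric checks listed above, the Durbin-Watson statistic and the variance inflation factor, are simple enough to compute by hand. The following is a minimal Python sketch, not the implementation used by any particular statistics package. The function names and the data are hypothetical, and the VIF helper assumes a model with exactly two predictors, where the VIF reduces to 1 / (1 - r^2) with r the correlation between them.

```python
from statistics import mean


def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of
    squared residuals. The statistic ranges from 0 to 4; values near 2
    suggest no first-order autocorrelation."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def pearson_r(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF for either predictor in a two-predictor model (assumption:
    with more predictors, r^2 must come from regressing each predictor
    on all of the others)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical residuals in observation order, and two hypothetical predictors.
residuals = [1, 1, -1, -1]
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

print("Durbin-Watson:", durbin_watson(residuals))
print("VIF:", vif_two_predictors(x1, x2))
```

A Durbin-Watson value well below 2 points toward positive autocorrelation and a value well above 2 toward negative autocorrelation. A VIF of 1 means the predictors are uncorrelated, and larger values indicate increasingly ill-conditioned data.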

If you determine that your model does not meet the criteria listed above, you should:

1    Check to see whether your data are entered correctly, especially observations identified as unusual.

2    Try to determine the cause of the problem. You may want to see how sensitive your model is to the issue. For example, if you have an outlier, run the regression without that observation and see how the results differ.

3    Consider using one of the possible solutions listed above. See [11], [35] for more information.
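
The sensitivity check described in step 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed procedure: it assumes a simple linear regression with one predictor, the data and function names are hypothetical, and it uses Cook's distance to pick the observation to set aside.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - b1 * mx, b1


def cooks_distance(x, y):
    """Cook's distance for each observation in a simple linear regression."""
    n, p = len(x), 2  # p = number of estimated coefficients (b0 and b1)
    b0, b1 = fit_line(x, y)
    mx = mean(x)
    sxx = sum((a - mx) ** 2 for a in x)
    resid = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    mse = sum(e ** 2 for e in resid) / (n - p)
    # Leverage of each point in a one-predictor model.
    lev = [1 / n + (a - mx) ** 2 / sxx for a in x]
    return [e ** 2 / (p * mse) * h / (1 - h) ** 2
            for e, h in zip(resid, lev)]


# Hypothetical data: the last observation looks unusual.
x = [1, 2, 3, 4, 5, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 5.0]

d = cooks_distance(x, y)
suspect = max(range(len(x)), key=lambda i: d[i])  # most influential point

# Fit with all observations, then again without the suspect one.
b0_all, b1_all = fit_line(x, y)
x_wo = x[:suspect] + x[suspect + 1:]
y_wo = y[:suspect] + y[suspect + 1:]
b0_wo, b1_wo = fit_line(x_wo, y_wo)

print("most influential observation (0-based index):", suspect)
print("slope with all observations:   ", round(b1_all, 3))
print("slope without that observation:", round(b1_wo, 3))
```

A large change in the coefficients between the two fits indicates that the model is sensitive to that observation. That alone does not justify removing it; check the data entry and the cause first, as in steps 1 and 2.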