Suppose you are a grade school curriculum director interested in which subject children identify as their favorite and how that choice is associated with their age or the teaching method employed. Thirty children, 10 to 13 years old, received classroom instruction in science, math, and language arts that employed either lecture or discussion techniques. At the end of the school year, they were asked to identify their favorite subject. We use nominal logistic regression because the response is categorical and its categories have no inherent ordering.
1 Open the worksheet EXH_REGR.MTW.
2 Choose Stat > Regression > Nominal Logistic Regression.
3 In Response, enter Subject. In Model, enter TeachingMethod Age. In Factors (optional), enter TeachingMethod.
4 Click Results. Choose In addition, list of factor level values, and tests for terms with more than 1 degree of freedom. Click OK in each dialog box.
Session window output
Nominal Logistic Regression: Subject versus TeachingMethod, Age
Response Information
Variable  Value    Count
Subject   science     10  (Reference Event)
          math        11
          arts         9
          Total       30
Factor Information
Factor          Levels  Values
TeachingMethod       2  discuss, lecture
Logistic Regression Table
                                                      Odds     95% CI
Predictor             Coef   SE Coef      Z      P   Ratio   Lower   Upper
Logit 1: (math/science)
Constant          -1.12266   4.56425  -0.25  0.806
TeachingMethod
 lecture         -0.563115  0.937591  -0.60  0.548    0.57    0.09    3.58
Age               0.124674  0.401079   0.31  0.756    1.13    0.52    2.49
Logit 2: (arts/science)
Constant          -13.8485   7.24256  -1.91  0.056
TeachingMethod
 lecture           2.76992   1.37209   2.02  0.044   15.96    1.08  234.90
Age                1.01354  0.584494   1.73  0.083    2.76    0.88    8.66

Log-Likelihood = -26.446
Test that all slopes are zero: G = 12.825, DF = 4, P-Value = 0.012
Goodness-of-Fit Tests
Method    Chi-Square  DF      P
Pearson      6.95295  10  0.730
Deviance     7.88622  10  0.640
The Session window output contains the following five parts:
Response Information displays the number of observations that fall into each of the response categories (science, math, and language arts), and the number of missing observations. The response value that has been designated as the reference event is the first entry under Value. Here, the default coding scheme defines the reference event as science using reverse alphabetical order.
Factor Information displays all the factors in the model, the number of levels for each factor, and the factor level values. The factor level that has been designated as the reference level is the first entry under Values. Here, the default coding scheme defines the reference level as discussion using alphabetical order.
Logistic Regression Table shows the estimated coefficients (parameter estimates), standard error of the coefficients, z-values, and p-values. You also see the odds ratio and a 95% confidence interval for the odds ratio. The coefficient associated with a predictor is the estimated change in the logit with a one unit change in the predictor, assuming that all other factors and covariates are the same.
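The relationship between one row's columns can be checked by hand. The following is a minimal Python sketch, not Minitab's own procedure: it assumes the usual Wald construction (Z is the coefficient divided by its standard error, the p-value is the two-sided normal tail, the odds ratio is the exponentiated coefficient, and the confidence limits are the exponentiated ends of coefficient ± 1.96 standard errors). The function name wald_summary is hypothetical; the inputs are copied from the TeachingMethod (lecture) row of Logit 2.

```python
import math

def wald_summary(coef, se, z_crit=1.959964):
    """Return Wald z, two-sided p-value, odds ratio, and CI limits for the odds ratio.

    z_crit is the standard normal quantile for a 95% interval (an assumption
    about how the interval in the table was built).
    """
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided standard normal tail area
    odds_ratio = math.exp(coef)
    lower = math.exp(coef - z_crit * se)
    upper = math.exp(coef + z_crit * se)
    return z, p, odds_ratio, lower, upper

# TeachingMethod (lecture), Logit 2 (arts/science), values from the table
z, p, odds_ratio, lower, upper = wald_summary(2.76992, 1.37209)
print(f"Z = {z:.2f}, P = {p:.3f}")
print(f"Odds ratio = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the computed values agree with the table row to the displayed precision. The wide interval (roughly 1.08 to 235) follows directly from the large standard error relative to the coefficient.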
Next displayed is the last Log-Likelihood from the maximum likelihood iterations along with the statistic G. G is the difference in -2 log-likelihood between a model that has only the constant terms and the fitted model shown in the Logistic Regression Table. G is the test statistic for the null hypothesis that all the coefficients associated with predictors equal 0, against the alternative that at least one of them is not 0. Here G = 12.825 with a p-value of 0.012, indicating that at α = 0.05 there is sufficient evidence that at least one coefficient is different from 0.
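As a rough check of G, the sketch below rebuilds it in Python from numbers already in the output. It assumes that the constant-only model assigns each response category its sample proportion, so its log-likelihood can be computed from the counts under Response Information. The helper chi2_sf_even_df is a hypothetical name; it uses the closed-form chi-square tail probability that holds only for even degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # accumulates (x/2)^i / i!
        total += term
    return math.exp(-half) * total

# Constant-only model: fitted probability of each category is its proportion.
counts = {"science": 10, "math": 11, "arts": 9}
n = sum(counts.values())
loglik_null = sum(c * math.log(c / n) for c in counts.values())

loglik_fitted = -26.446           # Log-Likelihood from the Session window
g = 2.0 * (loglik_fitted - loglik_null)
p_value = chi2_sf_even_df(g, df=4)   # 2 logits x 2 slopes each = 4 slopes
print(f"null log-likelihood = {loglik_null:.3f}")
print(f"G = {g:.3f}, P-Value = {p_value:.3f}")
```

Because the reported log-likelihood is rounded to three decimals, the recomputed G can differ from 12.825 in the last digit; the p-value still rounds to 0.012.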
Goodness-of-Fit Tests displays Pearson and deviance goodness-of-fit tests. In our example, the p-value for the Pearson test is 0.730 and the p-value for the deviance test is 0.640, indicating that there is insufficient evidence to conclude that the model fits the data poorly. If the p-value is less than your selected α-level, the test would indicate that the model does not fit the data.
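The two goodness-of-fit p-values can be reproduced from the chi-square statistics and degrees of freedom in the table. This Python sketch assumes both statistics are referred to a chi-square distribution with the listed DF of 10; chi2_sf_even_df is a hypothetical helper using the closed form for even degrees of freedom, and alpha = 0.05 is an illustrative choice of α-level.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # accumulates (x/2)^i / i!
        total += term
    return math.exp(-half) * total

alpha = 0.05                      # illustrative alpha-level
statistics = {"Pearson": 6.95295, "Deviance": 7.88622}   # from the table
p_values = {}
for method, chi_square in statistics.items():
    p_values[method] = chi2_sf_even_df(chi_square, df=10)
    if p_values[method] < alpha:
        verdict = "evidence of lack of fit"
    else:
        verdict = "no evidence of lack of fit"
    print(f"{method}: P = {p_values[method]:.3f} ({verdict})")
```

A large p-value here means only that the test found no lack of fit; it does not by itself establish that the model is correct.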