Binary Logistic Regression
Goodness-of-Fit Tests - Hosmer-Lemeshow Test
When fitting a logistic model, you want to choose a model (link function and predictors) that results in a good fit to your data. Goodness-of-fit statistics can be used to compare the fits of different models. A low p-value indicates that the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict.
By default, Minitab provides three goodness-of-fit tests: Pearson, Deviance, and Hosmer-Lemeshow.
The Hosmer-Lemeshow test assesses the model fit by comparing the observed and expected frequencies. The test groups the data by their estimated probabilities from lowest to highest, then performs a Chi-square test to determine if the observed and expected frequencies are significantly different.
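The grouping and comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not Minitab's implementation: the function name `hosmer_lemeshow`, the rule of cutting the sorted data into near-equal-sized groups, and the use of (number of groups - 2) degrees of freedom are assumptions made for the example.

```python
def hosmer_lemeshow(probs, outcomes, n_groups=10):
    """Return (chi_square, df, groups) for a Hosmer-Lemeshow-style test.

    probs    -- estimated event probabilities from a fitted model
    outcomes -- observed responses, 1 for an event and 0 for a nonevent
    """
    # Order observations by estimated probability, lowest to highest.
    # Sorting on the probability alone keeps tied observations in input order.
    pairs = sorted(zip(probs, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    chi_square = 0.0
    groups = []
    for g in range(n_groups):
        # Assumed grouping rule: near-equal-sized slices of the sorted data.
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue
        obs_events = sum(y for _, y in chunk)
        exp_events = sum(p for p, _ in chunk)  # expected events = sum of probabilities
        obs_nonevents = len(chunk) - obs_events
        exp_nonevents = len(chunk) - exp_events
        # Chi-square contribution from the event and nonevent cells of this group.
        for observed, expected in ((obs_events, exp_events),
                                   (obs_nonevents, exp_nonevents)):
            if expected > 0:
                chi_square += (observed - expected) ** 2 / expected
        groups.append((chunk[0][0], chunk[-1][0],
                       obs_events, exp_events, obs_nonevents, exp_nonevents))
    df = len(groups) - 2  # assumed convention: number of groups minus 2
    return chi_square, df, groups
```

Each entry of `groups` holds the lowest and highest probability in the group, followed by the observed and expected counts for events and nonevents, which mirrors the layout of a Hosmer-Lemeshow frequency table.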
Example Output
Goodness-of-Fit Tests
Test             DF  Chi-Square  P-Value
Deviance         67       76.77    0.194
Pearson          67       76.11    0.209
Hosmer-Lemeshow   8        5.58    0.694
Observed and Expected Frequencies for Hosmer-Lemeshow Test
       Event Probability      Bought = 1           Bought = 0
Group  Range               Observed  Expected   Observed  Expected
 1     (0.000, 0.065)          1       0.4          6       6.6
 2     (0.065, 0.137)          1       0.7          6       6.3
 3     (0.137, 0.193)          1       1.1          6       5.9
 4     (0.193, 0.232)          0       1.5          7       5.5
 5     (0.232, 0.252)          2       1.7          5       5.3
 6     (0.252, 0.304)          1       2.0          6       5.0
 7     (0.304, 0.466)          4       2.8          3       4.2
 8     (0.466, 0.514)          4       3.5          3       3.5
 9     (0.514, 0.552)          5       4.3          3       3.7
10     (0.552, 0.568)          3       4.0          4       3.0
Interpretation
For the cereal data, the relatively large p-value (0.694) for the test indicates that there is consistency between the observed and expected frequencies.
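The largest difference between these values is found in Group 4, where 0 events (Bought = 1) were observed against 1.5 expected, and 7 nonevents (Bought = 0) were observed against 5.5 expected.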
If you scan the table of observed and expected frequencies, you can see that the observed and expected values are generally quite close.
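As a rough check on this interpretation, the Hosmer-Lemeshow statistic can be recomputed from the table of observed and expected frequencies. The sketch below copies the counts from that table. The helper `chi2_sf_even_df` is a hypothetical name for a closed-form upper-tail chi-square probability that is valid only for even degrees of freedom, which applies here because DF = 8. Because the tabled expected values are rounded to one decimal place, the result should land close to, but not exactly on, the reported chi-square of 5.58 and p-value of 0.694.

```python
import math

# (observed, expected) counts copied from the table, groups 1 through 10.
bought_1 = [(1, 0.4), (1, 0.7), (1, 1.1), (0, 1.5), (2, 1.7),
            (1, 2.0), (4, 2.8), (4, 3.5), (5, 4.3), (3, 4.0)]
bought_0 = [(6, 6.6), (6, 6.3), (6, 5.9), (7, 5.5), (5, 5.3),
            (6, 5.0), (3, 4.2), (3, 3.5), (3, 3.7), (4, 3.0)]

# Sum (observed - expected)^2 / expected over all 20 cells.
chi_square = sum((o - e) ** 2 / e for o, e in bought_1 + bought_0)
df = len(bought_1) - 2  # 10 groups - 2 = 8, matching the DF in the output


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))


p_value = chi2_sf_even_df(chi_square, df)

# Group whose observed event count is farthest from its expected count.
gaps = [abs(o - e) for o, e in bought_1]
largest_gap_group = gaps.index(max(gaps)) + 1

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.3f}")
print("largest observed-expected gap is in group", largest_gap_group)
```

The last line identifies the group with the largest gap between observed and expected event counts, which can be compared with the group named in the interpretation above.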