Binary Logistic Regression

Goodness-of-Fit Tests - Hosmer-Lemeshow Test

When you fit a logistic model, you want to choose a model (link function and predictors) that fits your data well. Goodness-of-fit statistics can be used to compare the fits of different models. A low p-value indicates that the predicted probabilities deviate from the observed probabilities by more than the binomial distribution would predict.

By default, Minitab provides three goodness-of-fit tests: Pearson, Deviance, and Hosmer-Lemeshow.

The Hosmer-Lemeshow test assesses the model fit by comparing the observed and expected frequencies of events and nonevents. The test orders the data by estimated probability from lowest to highest and divides them into groups, then performs a chi-square test to determine whether the observed and expected frequencies in those groups are significantly different.
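The grouping step described above can be sketched in Python. This is a minimal illustration, not Minitab's implementation: the function name `hosmer_lemeshow_groups`, its parameters, and the rule for splitting sorted cases into nearly equal-sized groups are assumptions, and Minitab's handling of ties and group boundaries may differ.

```python
def hosmer_lemeshow_groups(events, probs, n_groups=10):
    """Tabulate observed and expected counts per probability group.

    events: 0/1 responses (1 = event); probs: estimated event probabilities
    in the same order. Hypothetical helper written for illustration only.
    """
    if len(events) != len(probs):
        raise ValueError("events and probs must have the same length")

    # Order cases from lowest to highest estimated probability.
    pairs = sorted(zip(probs, events))
    n = len(pairs)

    table = []
    for g in range(n_groups):
        # Integer arithmetic spreads any remainder across the groups.
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # fewer cases than groups
        size = len(chunk)
        obs_event = sum(y for _, y in chunk)
        exp_event = sum(p for p, _ in chunk)  # sum of fitted probabilities
        table.append({
            "prob_range": (chunk[0][0], chunk[-1][0]),
            "obs_event": obs_event,
            "exp_event": exp_event,
            "obs_nonevent": size - obs_event,
            "exp_nonevent": size - exp_event,
        })
    return table
```

Each returned row corresponds to one line of an observed-and-expected frequency table: the expected number of events in a group is the sum of the estimated probabilities of its cases, and the expected number of nonevents is the group size minus that sum.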

Example Output

Goodness-of-Fit Tests

Test             DF  Chi-Square  P-Value
Deviance         67       76.77    0.194
Pearson          67       76.11    0.209
Hosmer-Lemeshow   8        5.58    0.694

Observed and Expected Frequencies for Hosmer-Lemeshow Test

            Event
         Probability       Bought = 1          Bought = 0
Group       Range      Observed  Expected  Observed  Expected
    1  (0.000, 0.065)         1       0.4         6       6.6
    2  (0.065, 0.137)         1       0.7         6       6.3
    3  (0.137, 0.193)         1       1.1         6       5.9
    4  (0.193, 0.232)         0       1.5         7       5.5
    5  (0.232, 0.252)         2       1.7         5       5.3
    6  (0.252, 0.304)         1       2.0         6       5.0
    7  (0.304, 0.466)         4       2.8         3       4.2
    8  (0.466, 0.514)         4       3.5         3       3.5
    9  (0.514, 0.552)         5       4.3         3       3.7
   10  (0.552, 0.568)         3       4.0         4       3.0

Interpretation

For the cereal data, the relatively large p-value (0.694) for the Hosmer-Lemeshow test indicates no significant difference between the observed and expected frequencies, so there is no evidence that the model fits the data poorly.

The largest difference between the observed and expected frequencies occurs in Group 4:

·    For Bought = 1, the observed frequency is 0, but 1.5 observations were expected.

·    For Bought = 0, the observed frequency is 7, but only 5.5 observations were expected.

If you scan through the table of observed and expected frequencies, you can see that the observed and expected values are generally close.
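As a rough check on the interpretation, the Hosmer-Lemeshow statistic can be recomputed from the frequency table. The sketch below is an illustration under stated assumptions, not Minitab's code: the names `TABLE`, `hl_statistic`, and `chi2_sf_even_df` are hypothetical, the degrees of freedom are taken as the number of groups minus 2 (which matches the DF of 8 in the output), and the p-value uses the closed-form chi-square upper tail that is valid only for even degrees of freedom.

```python
import math

# (observed events, expected events, observed nonevents, expected nonevents)
# for Groups 1-10, copied from the table above (Bought = 1 is the event).
TABLE = [
    (1, 0.4, 6, 6.6),
    (1, 0.7, 6, 6.3),
    (1, 1.1, 6, 5.9),
    (0, 1.5, 7, 5.5),
    (2, 1.7, 5, 5.3),
    (1, 2.0, 6, 5.0),
    (4, 2.8, 3, 4.2),
    (4, 3.5, 3, 3.5),
    (5, 4.3, 3, 3.7),
    (3, 4.0, 4, 3.0),
]


def hl_statistic(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum(
        (o1 - e1) ** 2 / e1 + (o0 - e0) ** 2 / e0
        for o1, e1, o0, e0 in table
    )


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


stat = hl_statistic(TABLE)
df = len(TABLE) - 2  # assumption: number of groups minus 2
p_value = chi2_sf_even_df(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, p-value = {p_value:.3f}")
```

Because the expected frequencies in the table are rounded to one decimal place, the recomputed values (a chi-square of about 5.6 and a p-value of about 0.69) differ slightly from the reported 5.58 and 0.694. Group 4 contributes the largest share of the statistic, consistent with the interpretation above.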