Coding Categorical Predictors in General Linear Models

To include categorical variables in your general linear model, Minitab codes the categories so that they can enter the regression equation. General Linear Models does this automatically. You have two coding options: -1, 0, 1 coding or 1, 0 coding. You can choose the coding scheme in the Coding subdialog box. Regardless of the coding that you choose, the test of the overall effect of the categorical variable remains the same.
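The last point can be checked numerically. The following sketch, written in plain Python rather than Minitab, fits a model with one categorical predictor by least squares under both codings and compares the error sums of squares. The response values, the choice of reference levels, and the helper names `solve` and `sse` are assumptions made for illustration only.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def sse(codes, y):
    """Error sum of squares from a least-squares fit with an intercept."""
    x = [[1.0] + [float(c) for c in row] for row in codes]
    p = len(x[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Hypothetical data: six observations from three locations.
location = ["Hong Kong", "Hong Kong", "London", "London", "London", "New York"]
y = [10.0, 12.0, 20.0, 23.0, 26.0, 31.0]

# -1, 0, 1 coding with New York as the reference level.
effect = {"Hong Kong": [1, 0], "London": [0, 1], "New York": [-1, -1]}
# 1, 0 coding with Hong Kong as the reference level.
dummy = {"Hong Kong": [0, 0], "London": [1, 0], "New York": [0, 1]}

sse_effect = sse([effect[loc] for loc in location], y)
sse_dummy = sse([dummy[loc] for loc in location], y)
print(sse_effect, sse_dummy)
```

Both fits reproduce the same fitted values, so the error sum of squares, and with it the F-test for the categorical variable, is the same under either coding. Only the individual coefficients differ.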

When you have categorical variables, the coefficients are interpreted relative to a reference level. See Setting reference levels in General Linear Model for more information.

-1, 0, 1 coding

-1, 0, 1 coding is the default for General Linear Models and Design of Experiments (DOE), and is also known as effect or treatment coding. This type of coding gives estimates of effects as differences from the mean.
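To make the phrase "differences from the mean" concrete, here is a minimal Python sketch for a model with a single categorical predictor. The response values are hypothetical. In this one-factor case, the intercept under -1, 0, 1 coding is the average of the level means, and each coefficient is a level mean minus that average.

```python
from statistics import mean

# Hypothetical response values for each location (illustration only).
data = {
    "Hong Kong": [10.0, 12.0],
    "London": [20.0, 22.0],
    "New York": [30.0, 32.0],  # treated as the reference level here
}

level_means = {level: mean(values) for level, values in data.items()}

# Intercept: the average of the level means.
intercept = mean(list(level_means.values()))

# Coefficients are estimated only for the non-reference levels.
coefficients = {
    level: level_means[level] - intercept
    for level in ("Hong Kong", "London")
}

# The reference-level effect is minus the sum of the other coefficients.
new_york_effect = -sum(coefficients.values())

print(intercept, coefficients, new_york_effect)
```

The reference level has no coefficient of its own, but its effect can be recovered because the effects sum to zero.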

In the design matrix, Minitab creates one column for each level except the reference level. A row is assigned a 1 in the column for its own level and a 0 in the other columns. A row that belongs to the reference level is assigned a -1 in every column. In the example below, New York is the reference level, so it is coded as -1 in both columns.

If location is...    Hong Kong is coded as...    London is coded as...
Hong Kong             1                           0
London                0                           1
New York             -1                          -1
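The coding rule above can be sketched in a few lines of Python. The function name `effect_code` and its arguments are hypothetical and are not part of Minitab; the sketch only mirrors the rule described here.

```python
def effect_code(values, levels, reference):
    """Return the column names and one row of -1, 0, 1 codes per value.

    No column is created for the reference level.
    """
    columns = [level for level in levels if level != reference]
    rows = []
    for value in values:
        if value == reference:
            # Reference-level rows get -1 in every column.
            rows.append([-1] * len(columns))
        else:
            # Other rows get 1 in their own column and 0 elsewhere.
            rows.append([1 if value == column else 0 for column in columns])
    return columns, rows


levels = ["Hong Kong", "London", "New York"]
columns, rows = effect_code(levels, levels, reference="New York")

print(columns)
for location, row in zip(levels, rows):
    print(location, row)
```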

1, 0 coding

1, 0 coding (also known as binary or dummy coding) is the default for regression analyses. This type of coding gives parameter estimates that can be interpreted as the difference between a level and the reference level.
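As a minimal Python sketch of that interpretation, assume a model with a single categorical predictor and hypothetical response values. In that case the intercept under 1, 0 coding is the mean of the reference level, and each coefficient is a level mean minus the reference-level mean.

```python
from statistics import mean

# Hypothetical response values for each location (illustration only).
data = {
    "Hong Kong": [10.0, 12.0],  # treated as the reference level here
    "London": [20.0, 22.0],
    "New York": [30.0, 32.0],
}
reference = "Hong Kong"

# Intercept: the mean response at the reference level.
intercept = mean(data[reference])

# Each coefficient: level mean minus the reference-level mean.
coefficients = {
    level: mean(values) - intercept
    for level, values in data.items()
    if level != reference
}

print(intercept, coefficients)
```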

For example, suppose that you want to include the categorical variable Location in your regression model. Location has three levels: Hong Kong, London, and New York. If you choose 1, 0 coding, Minitab codes the three levels of the variable as follows.

If location is...    London is coded as...    New York is coded as...
Hong Kong            0                        0
London               1                        0
New York             0                        1

The design matrix does not include a separate column of codes for the reference level Hong Kong. When an observation equals the reference level, it is coded as 0 in all of the dummy variable columns.
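The same rule can be sketched in Python. The function name `dummy_code` and its arguments are hypothetical and are not part of Minitab; the sketch only mirrors the 1, 0 coding described here.

```python
def dummy_code(values, levels, reference):
    """Return the column names and one row of 1, 0 codes per value.

    No column is created for the reference level, so a reference-level
    row is 0 in every column.
    """
    columns = [level for level in levels if level != reference]
    rows = [[1 if value == column else 0 for column in columns]
            for value in values]
    return columns, rows


levels = ["Hong Kong", "London", "New York"]
columns, rows = dummy_code(levels, levels, reference="Hong Kong")

print(columns)
for location, row in zip(levels, rows):
    print(location, row)
```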

Note

Calc > Make Indicator Variables uses 1, 0 coding.

Tip

To see how Minitab codes the categorical variables in your analysis, store the design matrix in the Storage subdialog box. Then choose Data > Display Data and select the design matrix to view it in the Session window.