Coding Categorical Predictors in Regression

To include categorical predictors in your general regression model, Minitab automatically codes the categories so that they can enter the regression equation. You have two coding options: -1, 0, 1 coding or 1, 0 coding. You can choose the coding scheme in the Coding subdialog box. Regardless of the coding that you choose, the test of the overall effect of the categorical variable remains the same.

When you have categorical predictors, the regression coefficients are interpreted relative to a reference level. See Setting reference levels in Regression for more information.
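The statement that the overall effect does not depend on the coding scheme can be illustrated with a small sketch. It assumes a model with a single categorical predictor, Location, for which the least-squares fit reproduces the mean of each level, so the coefficients under either scheme can be written directly from the level means. The response values, the variable names, and the choice of reference levels (Hong Kong for 1, 0 coding, New York for -1, 0, 1 coding) are illustrative assumptions; this is not Minitab's implementation.

```python
from statistics import mean

# Hypothetical response values for each level of Location (illustrative only).
data = {
    "Hong Kong": [10, 12],
    "London": [14, 16],
    "New York": [18, 20],
}
level_means = {level: mean(values) for level, values in data.items()}

# 1, 0 coding with Hong Kong as the reference level:
# the constant is the reference mean; each coefficient is a difference from it.
dummy_const = level_means["Hong Kong"]
dummy_coef = {
    level: m - dummy_const
    for level, m in level_means.items()
    if level != "Hong Kong"
}

# -1, 0, 1 coding with New York as the reference level:
# the constant is the mean of the level means; each coefficient is a
# difference from that mean.
effect_const = mean(list(level_means.values()))
effect_coef = {
    level: m - effect_const
    for level, m in level_means.items()
    if level != "New York"
}


def fitted_dummy(level):
    """Fitted value under 1, 0 coding (reference rows are 0 in every column)."""
    return dummy_const + dummy_coef.get(level, 0)


def fitted_effect(level):
    """Fitted value under -1, 0, 1 coding (reference rows are -1 in every column)."""
    if level == "New York":
        return effect_const - sum(effect_coef.values())
    return effect_const + effect_coef[level]


for level in data:
    print(level, fitted_dummy(level), fitted_effect(level))
```

The individual coefficients differ between the two schemes, but the fitted value for each level is the same, which is why a test of the overall effect of the predictor is unaffected by the choice.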

-1, 0, 1 coding

You can code categorical predictors using a -1, 0, 1 scheme (also known as effect coding). This type of coding gives estimates of effects as differences from the overall mean. -1, 0, 1 coding is used in General Linear Models and Design of Experiments (DOE).

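In the design matrix, Minitab creates one column for each non-reference level and assigns a 1 when a row belongs to that column's level and a 0 when it belongs to a different non-reference level. No column is created for the reference level. In the example below, the categorical predictor Location has three levels and New York is the reference level, so every row for New York is assigned -1 in all columns.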

If location is...    Hong Kong is coded as...    London is coded as...
Hong Kong             1                           0
London                0                           1
New York             -1                          -1
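The table above can be reproduced with a short sketch. The function name `effect_code` and its arguments are hypothetical and chosen only for illustration; the sketch shows the coding rule described in this topic and does not represent how Minitab builds the design matrix internally.

```python
def effect_code(levels, reference):
    """Return (column_levels, coding) for -1, 0, 1 coding.

    coding maps each level to its row of codes. The reference level
    gets no column of its own and is coded -1 in every column.
    """
    columns = [level for level in levels if level != reference]
    coding = {}
    for level in levels:
        if level == reference:
            coding[level] = [-1] * len(columns)
        else:
            coding[level] = [1 if level == col else 0 for col in columns]
    return columns, coding


columns, coding = effect_code(
    ["Hong Kong", "London", "New York"], reference="New York"
)
print("columns:", columns)
for level, row in coding.items():
    print(level, row)
```

Passing a different `reference` moves the row of -1 values to that level and creates columns for the remaining levels.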

1, 0 coding

1, 0 coding (also known as binary or dummy coding) is commonly used in regression analyses. This type of coding gives parameter estimates that can be interpreted as the difference between a level and the reference level.

For example, you want to include the categorical predictor Location in your regression model. Location has three levels: Hong Kong, London, and New York. If you choose 1, 0 coding, Minitab codes the three levels of the predictor as follows.

If location is...    London is coded as...    New York is coded as...
Hong Kong            0                        0
London               1                        0
New York             0                        1

The design matrix does not include a separate column of codes for the reference level Hong Kong. When an observation equals the reference level, it is coded as 0 in all of the dummy variable columns.
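As a minimal sketch of the rule described above, the following code turns a column of Location values into 1, 0 coded rows. The function name `dummy_design_rows` and the sample observations are hypothetical; the sketch only mirrors the coding rule and is not Minitab's own procedure.

```python
def dummy_design_rows(observations, levels, reference):
    """Return (column_levels, rows) for 1, 0 coding.

    One column is created per non-reference level. An observation at the
    reference level is coded 0 in every column.
    """
    columns = [level for level in levels if level != reference]
    rows = [[1 if obs == col else 0 for col in columns] for obs in observations]
    return columns, rows


# Hypothetical data column, for illustration only.
observations = ["London", "Hong Kong", "New York", "Hong Kong"]
columns, rows = dummy_design_rows(
    observations,
    levels=["Hong Kong", "London", "New York"],
    reference="Hong Kong",
)
print("columns:", columns)
for obs, row in zip(observations, rows):
    print(obs, row)
```

Each Hong Kong observation produces a row of zeros, which matches the absence of a Hong Kong column in the table.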

Note

Calc > Make Indicator Variables uses 1, 0 coding.

Tip

To see how Minitab codes the categorical predictors in your analysis, store the design matrix in the Storage subdialog box. Then choose Data > Display Data and select the design matrix to view it in the Session window.