Multiple comparisons of means

Multiple comparisons of means allow you to examine which means are different and to estimate by how much they are different. After you fit a general linear model, you can obtain multiple comparisons of means if you use Stat > ANOVA > General Linear Model > Comparisons.

You must make the following choices when using multiple comparisons:

·    Pairwise comparisons or comparisons with a control

·    Which means to compare

·    The method of comparison

·    How to display the results

Pairwise comparisons or comparisons with a control

Choose Pairwise when you do not have a control level and you want to compare all combinations of means.

Choose With a Control to compare the level means to the mean of a control group. When this method is suitable, it is inefficient to use pairwise comparisons because pairwise confidence intervals are wider and the hypothesis tests are less powerful for a given confidence level.
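One reason the pairwise intervals are wider is that the family of comparisons is larger. The following minimal sketch counts the comparisons in each case for a factor with k levels; the level names are hypothetical and the code only illustrates the counting, not how Minitab computes the intervals.

```python
from itertools import combinations

def pairwise_comparisons(levels):
    """Every unordered pair of levels: k * (k - 1) / 2 comparisons."""
    return list(combinations(levels, 2))

def control_comparisons(levels, control):
    """Each non-control level against the control: k - 1 comparisons."""
    return [(level, control) for level in levels if level != control]

levels = ["Control", "A", "B", "C", "D"]  # hypothetical factor levels

print(len(pairwise_comparisons(levels)))            # 10
print(len(control_comparisons(levels, "Control")))  # 4
```

With five levels, comparing with a control needs 4 comparisons instead of 10, so a simultaneous confidence level has to cover a smaller family.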

Which means to compare

Choosing which means to compare is an important consideration when using multiple comparisons; a poor choice can result in actual confidence levels that differ from the levels that you specified. You may need to consider these issues:

·    Do you compare the means for only those terms with a significant F-test or for those sets of means for which differences appear to be large?

·    How deep into the design do you compare means: only within each factor, within each combination of first-level interactions, or across combinations of higher level interactions?

It is probably a good idea to decide which means you will compare before collecting your data. If you compare only those means with differences that appear to be large, which is called data snooping, then you are increasing the likelihood that the results suggest a real difference where no difference exists [15], [28]. Similarly, if you condition the application of multiple comparisons upon achieving a significant F-test, then the error rate of the multiple comparisons can be higher than the error rate in the unconditioned application of multiple comparisons [15], [23].

In practice, however, many people commonly use F-tests to guide the choice of which means to compare. The ANOVA F-tests and multiple comparisons are not entirely separate assessments. For example, if the p-value of an F-test is 0.9, you probably will not find statistically significant differences among means by multiple comparisons.

How deep within the design should you compare means? There is a trade-off: if you compare means at all two-factor combinations and higher-order interactions turn out to be significant, then the means that you compare might be a mix of effects; if you compare means at too deep a level, you lose power because the sample sizes become smaller and the number of comparisons becomes larger. In practice, you might decide to compare means for factor level combinations for which you believe the interactions are meaningful.

Minitab lets you compare means only for fixed terms or for interactions among fixed terms. Nesting is considered to be a form of interaction.

If you have 2 factors named A and B and you specify A B, Minitab displays multiple comparisons within each factor. If you specify the interaction (A * B), Minitab displays multiple comparisons for all level combinations of factors A and B.
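As a rough illustration of how the choice of terms changes the number of comparisons, the sketch below enumerates pairwise comparisons for two hypothetical factors. The level names are made up, and the code assumes pairwise comparisons rather than comparisons with a control.

```python
from itertools import combinations, product

a_levels = ["a1", "a2"]        # hypothetical levels of factor A
b_levels = ["b1", "b2", "b3"]  # hypothetical levels of factor B

# Specifying "A B": pairwise comparisons within each factor separately.
within_a = list(combinations(a_levels, 2))
within_b = list(combinations(b_levels, 2))

# Specifying "A * B": pairwise comparisons among all level combinations.
cells = list(product(a_levels, b_levels))
among_cells = list(combinations(cells, 2))

print(len(within_a), len(within_b))  # 1 3
print(len(cells), len(among_cells))  # 6 15
```

Even in this small design the interaction term produces 15 comparisons instead of 4, and each cell mean is based on fewer observations than a factor level mean.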

The multiple comparison method

Choose the comparison procedure based on the group means that you want to compare, the type of confidence level that you want to specify, and how conservative you want the results to be. "Conservative" in this context indicates that the true confidence level is likely to be greater than the confidence level that is displayed.

Except for Fisher's method, the multiple comparison methods have built-in protection against false positives. Because of this protection, the intervals are wider than they would be without it.
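To see why protection matters, consider a simplified calculation of the chance of at least one false positive when no protection is applied. The sketch assumes that the comparisons are independent, which pairwise comparisons of means generally are not, so the values are only an approximation.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive among m independent tests,
    each run at individual level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** m

# m = 3, 10, and 45 are the numbers of pairwise comparisons
# among 3, 5, and 10 means.
for m in (1, 3, 10, 45):
    print(m, round(familywise_error_rate(0.05, m), 3))
```

Under this assumption, the error rate for the whole family grows quickly with the number of comparisons even though each individual test is run at 0.05.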

Some characteristics of the multiple comparison methods are summarized below:

Comparison method   Properties                                                          Confidence level that you specify
Tukey               all pairwise comparisons, not conservative                          Simultaneous
Fisher              no protection against false positives due to multiple comparisons   Individual
Dunnett             comparison to a control, not conservative                           Simultaneous
Bonferroni          most conservative                                                   Simultaneous
Sidak               conservative, but slightly less than Bonferroni                     Simultaneous
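The relationship between the Bonferroni and Sidak methods can be illustrated with the textbook per-comparison significance levels for a family of m comparisons. This is a sketch of the standard formulas, not necessarily the exact computation that Minitab performs.

```python
def bonferroni_alpha(family_alpha, m):
    """Per-comparison level under Bonferroni: alpha / m."""
    return family_alpha / m

def sidak_alpha(family_alpha, m):
    """Per-comparison level under Sidak: 1 - (1 - alpha)^(1/m)."""
    return 1 - (1 - family_alpha) ** (1 / m)

m = 10  # for example, all pairwise comparisons among 5 means
print(bonferroni_alpha(0.05, m))
print(sidak_alpha(0.05, m))
```

For any m greater than 1, the Sidak level is slightly larger than the Bonferroni level, so the Sidak intervals are slightly narrower.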

How to display the results

Minitab presents multiple comparison results in three forms: a grouping information table, confidence intervals, and hypothesis tests.

The grouping information table contains columns of letters that group the factor levels. Levels that share a letter are not significantly different. Conversely, if they do not share a letter, the means are significantly different.
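The rule for reading the grouping letters can be written as a small check. The grouping below is hypothetical and is not output from Minitab.

```python
def significantly_different(groups, level_1, level_2):
    """Two levels differ significantly when they share no grouping letter."""
    return not (set(groups[level_1]) & set(groups[level_2]))

# Hypothetical grouping information: level -> grouping letters
groups = {"Blend 4": "A", "Blend 2": "AB", "Blend 3": "AB", "Blend 1": "B"}

print(significantly_different(groups, "Blend 4", "Blend 1"))  # True
print(significantly_different(groups, "Blend 4", "Blend 2"))  # False
```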

When viewing confidence intervals, you can assess the practical significance of differences among means, in addition to statistical significance. As usual, you reject the null hypothesis of no difference between means when the confidence interval does not contain zero.

Except for Fisher's method, Minitab calculates adjusted p-values for hypothesis test statistics. The adjusted p-value for a particular hypothesis within a collection of hypotheses is the smallest family-wise α-level at which the particular hypothesis would be rejected.
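As an illustration of this definition, the sketch below applies the textbook single-step Bonferroni and Sidak adjustments to a set of individual p-values. The p-values are hypothetical, and these formulas are an assumption for illustration; the adjustments for the Tukey and Dunnett methods rely on other distributions and are not shown.

```python
def bonferroni_adjusted(p_values):
    """Smallest family-wise level at which each test is rejected: min(1, m * p)."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]

def sidak_adjusted(p_values):
    """Sidak counterpart: 1 - (1 - p)^m."""
    m = len(p_values)
    return [1 - (1 - p) ** m for p in p_values]

raw = [0.004, 0.020, 0.300]  # hypothetical individual p-values

for p, b, s in zip(raw, bonferroni_adjusted(raw), sidak_adjusted(raw)):
    print(p, round(b, 4), round(s, 4))
```

You reject a hypothesis at a family-wise level of 0.05 when its adjusted p-value is at most 0.05. Here only the first comparison meets that criterion under either adjustment.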