Multiple Comparisons

When the null hypothesis $ H_0: \mu_1 = \cdots = \mu_k$ is rejected, the question arises as to which pairs of $ \mu_i$'s really differ. This is investigated by analyzing all pairwise differences, a procedure called multiple comparisons.

The data from $ k$ groups are arranged either (a) in a single variable, with a second categorical variable indicating the factor level, or (b) in multiple columns, each of which holds the observations for one factor level.
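The two layouts carry the same information, so one can be converted into the other. The following is a minimal sketch in plain Python; the function names `wide_to_long` and `long_to_wide` and the sample values are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def wide_to_long(wide):
    """Layout (b) -> layout (a): one (level, value) pair per observation."""
    return [(level, x) for level, values in wide.items() for x in values]

def long_to_wide(long_rows):
    """Layout (a) -> layout (b): one list of observations per factor level."""
    wide = defaultdict(list)
    for level, x in long_rows:
        wide[level].append(x)
    return dict(wide)

# Hypothetical data: three groups of unequal size.
wide = {"A": [4.1, 5.0, 4.6], "B": [6.2, 5.8], "C": [5.1, 4.9, 5.5, 5.3]}
long_rows = wide_to_long(wide)
```

Layout (b) allows unequal group sizes only if the columns may have different lengths, which is why a dictionary of lists is used here rather than a rectangular table.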

Here we construct the confidence intervals simultaneously for all pairwise differences $ \theta_{ij} = \mu_i - \mu_j$. The point estimate of $ \theta_{ij}$ and the estimate of the variance $ \sigma_{ij}^2$ of $ \hat{\theta}_{ij}$ are $ \hat{\theta}_{ij} = \bar{X}_{i\cdot} - \bar{X}_{j\cdot}$ and $ \hat{\sigma}_{ij}^2 = MS_{\mbox{error}}\times\left(\frac{1}{n_i} + \frac{1}{n_j}\right)$, respectively, where $ n_i$ is the size of the $ i$-th group and $ MS_{\mbox{error}}$ is the error mean square with $ n - k$ degrees of freedom ($ n = n_1 + \cdots + n_k$). Various methods have been proposed to find a critical point $ \rho$ so that we can obtain the confidence intervals

$\displaystyle \left( \hat{\theta}_{ij} - \hat{\sigma}_{ij} \rho,\: \hat{\theta}_{ij} + \hat{\sigma}_{ij} \rho \right)$    for every pairwise difference $ \theta_{ij}$

whose overall coverage probability is no less than $ 1 - \alpha$.
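As an illustration, the estimates and intervals can be computed as in the sketch below. It assumes the data are held as a dictionary of lists (one list per factor level) and that the critical point $ \rho$ is supplied by the caller; the function name `pairwise_intervals` and the sample data are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pairwise_intervals(groups, rho):
    """Return {(i, j): (theta_hat, sigma_hat, lower, upper)} for all pairs.

    groups: dict mapping a factor level to its list of observations.
    rho: critical point supplied by the caller (assumed common to all pairs).
    """
    n = sum(len(x) for x in groups.values())
    k = len(groups)
    means = {g: sum(x) / len(x) for g, x in groups.items()}
    # MS_error = within-group sum of squares divided by (n - k).
    ss_error = sum((v - means[g]) ** 2 for g, x in groups.items() for v in x)
    ms_error = ss_error / (n - k)
    out = {}
    for i, j in combinations(groups, 2):
        theta = means[i] - means[j]
        sigma = sqrt(ms_error * (1 / len(groups[i]) + 1 / len(groups[j])))
        out[(i, j)] = (theta, sigma, theta - sigma * rho, theta + sigma * rho)
    return out

# Hypothetical data and an arbitrary critical point, for illustration only.
groups = {"A": [1.0, 2.0, 3.0], "B": [2.0, 3.0, 4.0], "C": [5.0, 6.0, 7.0]}
intervals = pairwise_intervals(groups, rho=2.0)
```

Here the critical point is deliberately left as an input, since its choice is exactly what distinguishes the methods that follow.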

  1. Tukey's method. Tukey introduced the studentized range distribution, that is, the distribution of

    $\displaystyle W = \max_{1 \le i, j \le k} \frac{\vert Z_i - Z_j\vert}{\sqrt{\chi^2 / (n - k)}}$

    where the $ Z_i$'s are independent standard normal random variables and $ \chi^2$ is a chi-square random variable with $ (n-k)$ degrees of freedom, independent of the $ Z_i$'s. Then we use

    $\displaystyle \rho_{ij} = q(\alpha) \sqrt{\displaystyle\sum_{c=1}^k \frac{1}{n_c} \left/ k \left(\frac{1}{n_i} + \frac{1}{n_j}\right)\right.}$

    where $ q(\alpha)$ is the $ 100(1-\alpha)$-th percentile of the distribution of $ W$. When all group sizes are equal, this reduces to $ \rho_{ij} = q(\alpha)/\sqrt{2}$.

  2. Scheffé's method. As a special case of Scheffé's S Method, we can obtain

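    $\displaystyle \rho = \sqrt{(k-1) F_{\alpha,k-1,n-k}}$

    where $ F_{\alpha,k-1,n-k}$ is the $ 100(1-\alpha)$-th percentile of the $ F$-distribution with $ (k-1, n-k)$ degrees of freedom.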

  3. Bonferroni's method. Boole's inequality implies that we can choose $ \rho = t_{\beta/2,n - k}$ with $ \beta = \alpha\left/ \binom{k}{2}\right.$. Here $ t_{\alpha,n}$ is the $ 100(1-\alpha)$-th percentile of Student's $ t$-distribution with $ n$ degrees of freedom.
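The three critical points can be evaluated numerically. The sketch below assumes SciPy (version 1.7 or later, which provides `scipy.stats.studentized_range`) is available; the function name `critical_points` and the group sizes are hypothetical, and the Tukey branch follows the formula for $ \rho_{ij}$ given in item 1.

```python
from math import comb, sqrt
from scipy import stats

def critical_points(sizes, alpha, i, j):
    """Critical points rho for the pair (i, j); sizes is the list n_1..n_k."""
    k, n = len(sizes), sum(sizes)
    df = n - k
    # Tukey: q(alpha) is the upper-alpha point of the studentized range.
    q = stats.studentized_range.ppf(1 - alpha, k, df)
    ratio = sum(1 / m for m in sizes) / (k * (1 / sizes[i] + 1 / sizes[j]))
    tukey = q * sqrt(ratio)
    # Scheffe: sqrt((k - 1) * upper-alpha point of F(k - 1, n - k)).
    scheffe = sqrt((k - 1) * stats.f.ppf(1 - alpha, k - 1, df))
    # Bonferroni: two-sided t point at level beta = alpha / C(k, 2).
    beta = alpha / comb(k, 2)
    bonferroni = stats.t.ppf(1 - beta / 2, df)
    return {"Tukey": tukey, "Scheffe": scheffe, "Bonferroni": bonferroni}

# Hypothetical design: k = 3 groups of 5 observations each, alpha = 0.05.
cp = critical_points([5, 5, 5], alpha=0.05, i=0, j=1)
```

Only the Tukey critical point depends on the pair $ (i, j)$, and only when the group sizes differ; the Scheffé and Bonferroni points are common to all pairs.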

The significance tests for pairwise differences $ \theta_{ij} = \mu_i - \mu_j$ are then performed as follows: if the confidence interval for $ \theta_{ij}$ does not contain zero, we reject `` $ \mu_i = \mu_j$.'' The larger the critical point $ \rho$ is, the harder it is to reject `` $ \mu_i = \mu_j$'' (that is, the more conservative the test is). Therefore, in practice we often choose the smallest critical point $ \rho = \min\{\rho_{\mbox{Tukey}},\rho_{\mbox{Scheffe}},\rho_{\mbox{Bonferroni}}\}$ in order to obtain the narrowest simultaneous confidence intervals, and hence the least conservative test.
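The decision rule can be sketched as follows. An interval $ \hat{\theta}_{ij} \pm \hat{\sigma}_{ij}\rho$ excludes zero exactly when $ \vert\hat{\theta}_{ij}\vert > \hat{\sigma}_{ij}\rho$. The function name `significant_pairs` is hypothetical, the numerical inputs are placeholder values for illustration only, and a critical point common to all pairs is assumed (as with equal group sizes).

```python
def significant_pairs(estimates, rhos):
    """Flag each pair whose interval excludes zero.

    estimates: {(i, j): (theta_hat, sigma_hat)} for every pair.
    rhos: {method name: critical point}, assumed common to all pairs.
    """
    rho = min(rhos.values())  # smallest critical point: least conservative
    return {pair: abs(theta) > sigma * rho
            for pair, (theta, sigma) in estimates.items()}

# Placeholder estimates and critical points, for illustration only.
estimates = {("A", "B"): (-1.0, 0.8165),
             ("A", "C"): (-4.0, 0.8165),
             ("B", "C"): (-3.0, 0.8165)}
rhos = {"Tukey": 2.67, "Scheffe": 2.79, "Bonferroni": 2.78}
decisions = significant_pairs(estimates, rhos)
```

With these placeholder numbers the threshold is about $ 0.8165 \times 2.67 \approx 2.18$, so only the pairs whose estimated difference exceeds that value in absolute terms are flagged.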

Remark on simultaneity. Whether we should conduct the analysis of variance (AOV) before multiple comparisons (MC) is a somewhat delicate issue, since it raises the question of simultaneity of the AOV and the MC. However, because of the duality between the AOV and Scheffé's S method, a systematic approach popular among statisticians requires the AOV in order to proceed with the MC. Also note that when we try different multiple comparison procedures (for example, Scheffé's and Tukey-Kramer's methods), we do not consider simultaneity across these procedures, and their conclusions may understandably be inconsistent (for example, Scheffé's method may not detect any significant difference while Tukey-Kramer's method indicates significance for some pairs).