Residual Analysis

The analysis of variance (AOV) test rests on the following assumption: the random variable $ \varepsilon_{ij} = X_{ij} - \mu_i$ is called a residual, and all $ \varepsilon_{ij}$'s are assumed to be independent and approximately normally distributed with mean 0 and common variance $ \sigma^2$.

The data from k groups are arranged either (a) in a single variable, with a second categorical variable indicating the ``levels,'' or (b) in multiple columns, each of which represents one ``level.'' In either case, (i) the original variables $ X_{ij}$ are converted to new variables $ Y_{ij}$ by a transformation if one is needed; otherwise, the transformation is left at ``no change,'' so that $ Y_{ij} = X_{ij}$. Then (ii) they are moved to the column Variable in the table below, and (iii) the residual $ Y_{ij} - \bar{Y}_{i\cdot}$ is calculated, where $ \bar{Y}_{i\cdot}$ is the sample mean of the $ i$-th group.
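Steps (i) and (iii) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `residuals`, the optional `transform` argument, and the sample numbers are all hypothetical, and the data are assumed to be in arrangement (b), one list per level.

```python
import math
from statistics import mean

def residuals(groups, transform=None):
    """Return, for each group i, the residuals Y_ij - Ybar_i.

    groups    : list of lists, one list of observations X_ij per level
    transform : optional function applied to each X_ij to obtain Y_ij;
                None corresponds to "no change" (Y_ij = X_ij)
    """
    result = []
    for xs in groups:
        # Step (i): transform the observations if a transformation is given.
        ys = [transform(x) for x in xs] if transform else list(xs)
        # Step (iii): subtract the group sample mean from each observation.
        ybar = mean(ys)
        result.append([y - ybar for y in ys])
    return result

# Hypothetical data for three levels (not taken from the text).
groups = [[4.1, 5.0, 6.2], [7.3, 8.1, 9.4, 10.0], [2.2, 3.1]]

raw_residuals = residuals(groups)                       # "no change"
log_residuals = residuals(groups, transform=math.log)   # log transformation
```

By construction the residuals within each group sum to zero, so what matters for the assumption check is their shape (for example, in a normal probability plot), not their mean.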

When nonnormality cannot be eliminated by a transformation, the Kruskal-Wallis test is appropriate for the hypothesis test. Here the null hypothesis $ H_0$ is that the k population distributions (not necessarily normal) are identical. The procedure calculates the test statistic and the p-value. Rejecting $ H_0$ provides evidence that not all of the distributions are the same.
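As a rough sketch of what such a calculation involves, the Python code below computes the usual Kruskal-Wallis statistic $ H = \frac{12}{N(N+1)} \sum_i R_i^2 / n_i - 3(N+1)$ from the rank sums $ R_i$ of the pooled data, applies the standard correction for ties, and obtains the p-value from the chi-square approximation with k - 1 degrees of freedom. The function names (`midranks`, `chi2_sf`, `kruskal_wallis`) and the sample data are hypothetical, and the text does not say how the page itself performs the calculation. The chi-square approximation is usually considered adequate only when the groups are not very small.

```python
import math

def midranks(values):
    """Rank the pooled values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of the ranks i+1 .. j+1
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes

def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the
    regularized lower incomplete gamma function (loses precision
    for extremely small p-values)."""
    if x <= 0:
        return 1.0
    a, z = df / 2, x / 2
    term = total = 1.0 / a
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def kruskal_wallis(groups):
    """Return (H, p) for a list of samples, one list per level."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks, tie_sizes = midranks(pooled)
    acc, start = 0.0, 0
    for g in groups:
        r_i = sum(ranks[start:start + len(g)])   # rank sum of group i
        acc += r_i * r_i / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * acc - 3 * (n_total + 1)
    correction = 1 - sum(t**3 - t for t in tie_sizes) / (n_total**3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)

# Hypothetical data for three levels (not taken from the text).
h_stat, p_value = kruskal_wallis([[4.1, 5.0, 6.2], [7.3, 8.1, 9.4], [2.2, 3.1, 3.9]])
```

If SciPy is available, `scipy.stats.kruskal` performs the same kind of computation and is the more practical choice; the hand-written version is meant only to make the steps visible.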