e-Statistics

## Goodness of Fit

In the experiment on pea breeding, Mendel's theory predicts the probabilities of occurrence associated with the types of progeny, say "round yellow", "wrinkled yellow", "round green", and "wrinkled green". Here we want to test whether the observed data are consistent with his theory, that is, to assess the goodness of fit.

The model probabilities

$$p_1, \ldots, p_k$$

are specified (usually in the column Probability or Percentage) for $k$ categories, or "cells". Out of the total size $n$, each observation is classified into one of the $k$ cells, and the expected cell frequencies

$$E_1, \ldots, E_k$$

are calculated from the model probabilities by

$$E_i = n p_i, \qquad i = 1, \ldots, k.$$

The observed cell frequencies

$$X_1, \ldots, X_k$$

sum to the total size, $n = X_1 + \cdots + X_k$. The goodness of fit to the model can then be assessed by comparing the observed cell frequencies with the expected cell frequencies. Here the null hypothesis is the statement "the model is valid". The discrepancy between the data and the model can be measured by Pearson's chi-square statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(X_i - E_i)^2}{E_i}.$$
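As a rough illustration of the comparison described above, the following sketch computes expected cell frequencies and Pearson's chi-square statistic in plain Python. The function name `pearson_chi_square` is hypothetical. The counts are the figures commonly quoted for Mendel's dihybrid pea cross with the 9:3:3:1 model; they are not given in the text above and serve only as an example.

```python
def pearson_chi_square(observed, probabilities):
    """Return (expected frequencies, Pearson chi-square statistic).

    observed      -- observed cell frequencies, one count per cell
    probabilities -- model probabilities for the same cells (must sum to 1)
    """
    if len(observed) != len(probabilities):
        raise ValueError("observed and probabilities must have the same length")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("model probabilities must sum to 1")

    n = sum(observed)                            # total size
    expected = [n * p for p in probabilities]    # expected count = n * p
    statistic = sum((x - e) ** 2 / e for x, e in zip(observed, expected))
    return expected, statistic


# Illustrative data (assumption): round yellow, wrinkled yellow,
# round green, wrinkled green, with model ratio 9:3:3:1.
observed = [315, 101, 108, 32]
probabilities = [9 / 16, 3 / 16, 3 / 16, 1 / 16]

expected, chi2 = pearson_chi_square(observed, probabilities)
print(expected)          # [312.75, 104.25, 104.25, 34.75]
print(round(chi2, 3))    # 0.47
```

A small value of the statistic, as here, means the observed counts lie close to the expected ones.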

Under the null hypothesis (that is, assuming that the model probabilities are correct), the distribution of Pearson's chi-square statistic $\chi^2$ is approximated by the chi-square distribution with $k - 1$ degrees of freedom. Therefore, at significance level $\alpha$, we can reject the null hypothesis if we observe $\chi^2 = x$ and $x > \chi^2_{\alpha, k-1}$, where $\chi^2_{\alpha, k-1}$ is the upper $\alpha$ critical point of that distribution; such a result casts doubt on the validity of the model. Equivalently, by computing the $p$-value

$$p^* = P(Y \ge x)$$

with a random variable $Y$ having the chi-square distribution with $k - 1$ degrees of freedom, we find that the null hypothesis is rejected if $p^* < \alpha$.
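The decision rule above can be sketched with the standard library alone. The helper `chi_square_sf` is a hypothetical name; it evaluates the upper tail probability of a chi-square distribution with integer degrees of freedom through the usual recurrence for the regularized incomplete gamma function. The counts are again the commonly quoted figures for Mendel's dihybrid cross, and the level 0.05 is an assumption made for the example.

```python
import math


def chi_square_sf(x, df):
    """Upper tail P(Y >= x) for Y chi-square with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)               # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))    # closed form for df = 1
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


# Illustrative data (assumption): observed counts and the 9:3:3:1 model.
observed = [315, 101, 108, 32]
probabilities = [9 / 16, 3 / 16, 3 / 16, 1 / 16]

n = sum(observed)
expected = [n * p for p in probabilities]
x = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1            # k - 1 degrees of freedom
p_value = chi_square_sf(x, df)    # p-value of the observed statistic
alpha = 0.05                      # significance level (assumption)
reject = p_value < alpha

print(round(x, 3), round(p_value, 3), reject)
```

With these counts the p-value is far above 0.05, so the model is not rejected. Where SciPy is available, `scipy.stats.chi2.sf(x, df)` gives the same tail probability.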