e-Statistics

Boxplot

A boxplot is a schematic presentation of summary statistics: a measure of location (the sample median), a measure of dispersion (the interquartile range, or IQR, which is the difference between the upper and lower sample quartiles), and the largest and smallest data values. It also indicates the symmetry or skewness of the distribution and the presence of possible outliers.

The box in the middle of a boxplot stretches from the lower quartile Q1 (the 25th percentile) to the upper quartile Q3 (the 75th percentile). The median Q2 is shown as a line across the box. Therefore, about 1/4 of the observations lie between this line and each end of the box. The straight lines stretching out from the ends of the box (horizontal or vertical, depending on the orientation of the plot) are called whiskers; they extend to the smallest and largest values, Q0 and Q4. In a computer-generated boxplot, potential outliers are marked separately from the whiskers, in which case each whisker stops at the most extreme value that is not flagged as an outlier.
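The quantities described above can be computed directly. The following is a minimal sketch, not the method used by any particular plotting program: it assumes the common convention that a point is a potential outlier when it lies more than 1.5 IQR beyond the box (the text does not state a rule), and it assumes the "inclusive" quartile convention of Python's `statistics.quantiles`. The function name `boxplot_stats` and the sample data are hypothetical.

```python
import statistics

def boxplot_stats(data, k=1.5):
    """Summary statistics behind a boxplot.

    k is the fence multiplier; k = 1.5 is an assumed convention,
    not something specified in the text.
    """
    xs = sorted(data)
    # Quartiles Q1, Q2 (median), Q3. Other quartile conventions
    # can give slightly different Q1 and Q3.
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Points beyond the fences are flagged as potential outliers.
    outliers = [x for x in xs if x < lower_fence or x > upper_fence]
    # Whiskers reach the most extreme values that are not flagged.
    inside = [x for x in xs if lower_fence <= x <= upper_fence]
    return {
        "Q0": xs[0], "Q1": q1, "Q2": q2, "Q3": q3, "Q4": xs[-1],
        "IQR": iqr,
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }

# Hypothetical sample with one unusually large value.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]
stats = boxplot_stats(sample)
print(stats)
```

For this sample the box runs from 4 to 8 with the median at 6, the value 30 is flagged as a potential outlier, and the upper whisker therefore stops at 9 rather than at Q4 = 30.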

The data for one variable are often grouped by another, categorical variable. Such a categorical variable is often referred to as a ``factor,'' and its values are called ``levels.'' Side-by-side boxplots are quite useful for comparing the data from different groups (i.e., different levels).
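Grouping by a factor can be sketched as follows: the observations are split by level, and the five numbers (Q0, Q1, Q2, Q3, Q4) that define each box are computed per group. This is an illustrative sketch only; the function name `summarize_by_level`, the level names, and the data are hypothetical, and the "inclusive" quartile convention is again an assumption.

```python
import statistics
from collections import defaultdict

def summarize_by_level(values, levels):
    """Five-number summary (Q0, Q1, Q2, Q3, Q4) for each level of a factor.

    values and levels are parallel sequences: levels[i] is the group
    to which values[i] belongs. Each level needs at least two values.
    """
    groups = defaultdict(list)
    for value, level in zip(values, levels):
        groups[level].append(value)

    summary = {}
    for level in sorted(groups):
        xs = sorted(groups[level])
        q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
        summary[level] = (xs[0], q1, q2, q3, xs[-1])
    return summary

# Hypothetical data: one numeric variable and one factor with levels A and B.
values = [10, 20, 12, 22, 14, 24, 16, 26, 18, 28]
levels = ["A", "B", "A", "B", "A", "B", "A", "B", "A", "B"]

by_level = summarize_by_level(values, levels)
for level, five in by_level.items():
    print(level, five)
```

Comparing the summaries level by level is what a side-by-side boxplot does visually: here every quartile of group B sits 10 units above the corresponding quartile of group A, while the two IQRs are equal.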