e-Statistics

Histogram

A histogram uses bars whose heights are proportional to the number of observations within each band. This graphical presentation describes how the data are distributed, and it is characterized by the "peaks" and "tails" of the histogram as follows. A histogram with one peak is called unimodal. When the histogram has two major peaks, it is said to be bimodal, suggesting the possibility of two distinct populations in the data. A symmetric histogram has tails of similar shape on both sides, whereas a skewed histogram has a longer tail on one side than on the other. We call it right-skewed or left-skewed according to whether we observe a longer right-hand tail or a longer left-hand tail.

The vertical axis of the histogram can show either frequency or density. The frequency is the number of observations within a band. When "Density" is selected from the drop-down menu, the area of each bar in the histogram represents the probability that an observation falls in the range spanned by that bar. For this reason, the density height must be calculated by the formula:

$\displaystyle (\mbox{\it density height}) = \frac{(\mbox{\it frequency})}{(\mbox{\it data size}) \times (\mbox{\it bandwidth})}.$
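The density-height formula can be checked numerically. The following is a minimal sketch in plain Python, assuming equal-width bands that start at a chosen lower edge; the function name `density_heights`, its parameters, and the sample values are hypothetical and are not part of e-Statistics.

```python
def density_heights(data, lower, bandwidth, n_bands):
    """Return (frequencies, density heights) for equal-width bands.

    Band k covers [lower + k*bandwidth, lower + (k+1)*bandwidth);
    the last band also includes its right endpoint.
    All names here are illustrative assumptions.
    """
    upper = lower + bandwidth * n_bands
    freq = [0] * n_bands
    for x in data:
        if x < lower or x > upper:
            continue  # value lies outside the plotted range
        k = min(int((x - lower) // bandwidth), n_bands - 1)
        freq[k] += 1
    # density height = frequency / (data size * bandwidth)
    heights = [f / (len(data) * bandwidth) for f in freq]
    return freq, heights


# Made-up sample of 8 observations, four bands of width 1 starting at 1.
sample = [1.2, 1.9, 2.4, 2.5, 3.1, 3.3, 3.8, 4.6]
bandwidth = 1.0
freq, heights = density_heights(sample, lower=1.0, bandwidth=bandwidth, n_bands=4)

# Each bar's area is height * bandwidth; the areas add up to 1
# when every observation falls inside the plotted range.
total_area = sum(h * bandwidth for h in heights)
```

Because each height is divided by both the data size and the bandwidth, the bar areas (not the bar heights) are the proportions, which is why the total area is 1 in this example.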

The width of each class (or band) is called the class width, or bandwidth. If the class width is too large, the graph fails to convey the structure of the data. On the other hand, if it is too small, the histogram becomes too spiky.
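The effect of the class width can be seen by counting the same data with different widths. This is a rough sketch under the assumption that bands start at the smallest observation; the helper `band_counts` and the data values are made up for illustration.

```python
def band_counts(data, class_width):
    """Count observations per band, with bands starting at min(data).

    Hypothetical helper: band k covers
    [min + k*class_width, min + (k+1)*class_width).
    """
    lower = min(data)
    n_bands = int((max(data) - lower) // class_width) + 1
    counts = [0] * n_bands
    for x in data:
        counts[int((x - lower) // class_width)] += 1
    return counts


# Made-up integer data with one isolated large value.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

too_wide = band_counts(data, 10)   # a single bar hides all structure
moderate = band_counts(data, 4)    # a few bars show the main mass and the tail
too_narrow = band_counts(data, 1)  # many bars, several of them empty
```

With a width of 10 everything collapses into one bar, while a width of 1 produces nine bars with gaps between the main cluster and the isolated value.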

If the interval between $ a$ and $ b$ is also specified, the frequency (or the proportion) of the observed values within the interval [a,b) is calculated. (The interval will be [a,b] if $ b$ is the maximum of the data.) With the density height on the vertical axis, the area under the histogram over the interval [a,b) represents the proportion of the sample that falls in that interval.
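The interval rule can be sketched as follows. This is only an illustration of the half-open convention described above, assuming a <= b; the function name `proportion_in_interval` and the sample values are hypothetical.

```python
def proportion_in_interval(data, a, b):
    """Return (frequency, proportion) of values in [a, b).

    The interval is treated as [a, b] when b equals the maximum of the data.
    Assumes a <= b; names are illustrative only.
    """
    include_b = (b == max(data))
    count = 0
    for x in data:
        if a <= x < b or (include_b and x == b):
            count += 1
    return count, count / len(data)


# Made-up sample of 8 observations; its maximum is 4.6.
sample = [1.2, 1.9, 2.4, 2.5, 3.1, 3.3, 3.8, 4.6]

inner = proportion_in_interval(sample, 2.0, 4.0)   # half-open [2, 4)
to_max = proportion_in_interval(sample, 3.0, 4.6)  # closed at the maximum
```

For this sample, five of the eight values lie in [2,4), so the proportion is 0.625. That equals the area of density bars of width 1 covering [2,3) and [3,4), whose heights would be 0.25 and 0.375.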