Normal Distribution

From the probabilistic point of view, a data set from a certain population is merely a collection of values randomly drawn from a common distribution which is not yet known. Such a distribution is characterized by a probability density function (PDF), which is often described by means of parameters. The parameters are used to represent a family of functions having the same general shape, and the introduction of such parameters gave rise to the discipline of statistical modeling.

A normal density function

$\displaystyle f(x) = \frac{1}{\sigma\sqrt{2\pi}}
\exp\left[-\frac{(x - \mu)^2}{2 \sigma^2}\right]
$

is determined by two parameters $ \mu$ and $ \sigma$. The first parameter $ \mu$ is called the mean, and it determines the center of the density. The second parameter $ \sigma$ is called the standard deviation (SD), and it determines the spread. The normal density function $ f(x)$ is unimodal and symmetric around $ \mu$. A small value of $ \sigma$ leads to a high peak with a sharp drop, and a larger value of $ \sigma$ leads to a flatter shape.
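
The effect of $ \sigma$ on the shape can be checked numerically. The following is a minimal Python sketch of the density formula above; the function name `normal_pdf` and the sample values of $ \sigma$ are illustrative choices, not part of the original notes.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density f(x) with mean mu and standard deviation sigma > 0."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# The peak sits at x = mu and has height 1 / (sigma * sqrt(2*pi)),
# so it gets lower (and the curve flatter) as sigma grows.
for sigma in (0.5, 1.0, 2.0):  # illustrative values
    print(sigma, normal_pdf(0.0, 0.0, sigma))
```

Evaluating `normal_pdf(mu + d, mu, sigma)` and `normal_pdf(mu - d, mu, sigma)` for any `d` gives the same value, which reflects the symmetry around $ \mu$.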

Assuming the normal density function $ f(x)$ for a certain population, we can predict the chance that an observation $ X$ from the population falls between $ a$ and $ b$. Given the density function $ f(x)$, this probability, denoted by $ P(a < X < b)$, is the area under the curve over the range between $ a$ and $ b$. Computing this area is what calculus calls ``integration'', and the value can be obtained numerically by

$ \displaystyle
P(a < X < b) = \int_a^b f(x)\, dx$
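
One way to carry out this integration numerically is composite Simpson's rule. The sketch below is only an illustration under that assumption (the notes do not say which numerical method is used); the names `normal_pdf` and `prob_between` and the default number of subintervals are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density f(x) with mean mu and standard deviation sigma > 0."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def prob_between(a, b, mu, sigma, n=1000):
    """Approximate P(a < X < b) for finite a < b by composite Simpson's rule."""
    if n % 2:          # Simpson's rule needs an even number of subintervals
        n += 1
    h = (b - a) / n
    total = normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma)
    for i in range(1, n):
        weight = 4 if i % 2 else 2   # interior weights alternate 4, 2, 4, ...
        total += weight * normal_pdf(a + i * h, mu, sigma)
    return total * h / 3.0

# Probability of falling within one SD of the mean.
p = prob_between(-1.0, 1.0, mu=0.0, sigma=1.0)
print(round(p, 4))  # 0.6827
```

This version handles only finite limits $ a$ and $ b$.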

To compute a one-sided probability $ P(-\infty < X < b)$ or $ P(a < X < \infty)$ in the same way, take $ a=-\infty$ or $ b=+\infty$ as the corresponding limit of integration; when the limits are entered numerically, these are written as ``-Inf'' or ``+Inf''.
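
Infinite limits are easiest to handle through the cumulative probability $ P(X < x)$, which can be written with the error function. The following Python sketch assumes that approach; `normal_cdf` and `normal_prob` are hypothetical helper names, and `math.inf` plays the role of ``Inf''.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X < x) for a normal X; x may be -inf or +inf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_prob(a, b, mu, sigma):
    """P(a < X < b) as a difference of cumulative probabilities."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

lower_tail = normal_prob(-math.inf, 1.0, mu=0.0, sigma=1.0)  # P(X < 1)
upper_tail = normal_prob(1.0, math.inf, mu=0.0, sigma=1.0)   # P(X > 1)
print(lower_tail, upper_tail)
```

The two tails add up to 1, as they must for any cut point.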

Z-score. The normal distribution with mean $ 0$ and standard deviation $ 1$, denoted by $ N(0,1)$, is called the standard normal distribution. Suppose that a random variable $ X$ is normally distributed with mean $ \mu$ and standard deviation $ \sigma$. Then $ Z = \displaystyle\frac{X - \mu}{\sigma}$ is called the z-score, and $ Z$ follows the standard normal distribution $ N(0,1)$. A standard normal distribution table lists the area under the standard normal curve from $ Z=0$ to $ Z=z$. Such a table can be used to compute probabilities involving any normal random variable, since $ P(a < X < b) = P\left(\displaystyle\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right)$.
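
As a worked illustration of the table lookup, the sketch below standardizes the endpoints and adds the tabulated areas. The population values ($ \mu = 100$, $ \sigma = 15$) and the helper names `z_score` and `table_area` are made up for the example; `table_area` computes the table entry with the error function instead of reading it from a printed table.

```python
import math

def z_score(x, mu, sigma):
    """Standardize x: number of SDs that x lies above the mean."""
    return (x - mu) / sigma

def table_area(z):
    """Area under the standard normal curve from 0 to z, for z >= 0."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

# Hypothetical example: X is normal with mu = 100 and sigma = 15; find P(85 < X < 130).
mu, sigma = 100.0, 15.0
z_lo = z_score(85.0, mu, sigma)    # -1.0
z_hi = z_score(130.0, mu, sigma)   # 2.0

# By symmetry, the area from -1 to 0 equals the table entry for z = 1.
p = table_area(abs(z_lo)) + table_area(z_hi)
print(round(p, 4))  # 0.8186
```

Because the endpoints fall on opposite sides of the mean, the two table entries are added; if both z-scores had the same sign, the smaller area would be subtracted from the larger one.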