
Bayesian Approach

Let $ f(\mathbf{x};\theta)$ be a density function with parameter $ \theta\in\Omega$. In a Bayesian model the parameter space $ \Omega$ has a distribution $ \pi(\theta)$, called a prior distribution. Furthermore, $ f(\mathbf{x};\theta)$ is viewed as the conditional distribution of $ \mathbf{X}$ given $ \theta$. By Bayes' rule the conditional density $ \pi(\theta\:\vert\:\mathbf{x})$ can be derived from

$\displaystyle \pi(\theta\:\vert\:\mathbf{x}) = \begin{cases}
\pi(\theta)f(\mathbf{x};\theta) \left/ \displaystyle\sum_{\theta\in\Omega} \pi(\theta)f(\mathbf{x};\theta) \right.
& \text{ if $\Omega$ is discrete; } \\[1ex]
\pi(\theta)f(\mathbf{x};\theta) \left/ \displaystyle\int_{\Omega} \pi(\theta)f(\mathbf{x};\theta)\, d\theta \right.
& \text{ if $\Omega$ is continuous. }
\end{cases}
$

Posterior distribution. The distribution $ \pi(\theta\:\vert\:\mathbf{x})$ is called the posterior distribution. Whether $ \Omega$ is discrete or continuous, the posterior density $ \pi(\theta\:\vert\:\mathbf{x})$ is proportional to $ \pi(\theta)f(\mathbf{x};\theta)$ up to a normalizing constant. Thus, we write

$\displaystyle \pi(\theta\:\vert\:\mathbf{x}) \propto \pi(\theta)f(\mathbf{x};\theta) .
$
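
For illustration, the following minimal Python sketch realizes this proportionality numerically: discretize $ \Omega$, form $ \pi(\theta)f(\mathbf{x};\theta)$ on the grid, and divide by its integral. The Bernoulli likelihood and flat prior below are assumptions chosen for concreteness, not taken from the text.

import numpy as np

def grid_posterior(prior, likelihood, grid):
    """Normalize pi(theta) * f(x; theta) on a uniformly spaced grid over Omega."""
    unnorm = prior(grid) * likelihood(grid)       # pi(theta) * f(x; theta)
    dtheta = grid[1] - grid[0]                    # uniform grid spacing
    return unnorm / (unnorm.sum() * dtheta)       # divide by the integral over Omega

theta = np.linspace(0.001, 0.999, 999)            # discretized Omega = (0, 1)
x = np.array([1, 0, 1, 1, 0])                     # hypothetical Bernoulli data
prior = lambda t: np.ones_like(t)                 # flat prior pi(theta)
likelihood = lambda t: t**x.sum() * (1 - t)**(len(x) - x.sum())
posterior = grid_posterior(prior, likelihood, theta)
print((posterior * (theta[1] - theta[0])).sum())  # ~1.0, so it is a proper density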

It is often the case that both the prior density function $ \pi(\theta)$ and the posterior density function $ \pi(\theta\:\vert\:\mathbf{x})$ belong to the same family of density functions $ \pi(\theta;\eta)$ with parameter $ \eta$. Then $ \pi(\theta;\eta)$ is called conjugate to $ f(\mathbf{x};\theta)$.

Exponential conjugate family. Suppose that the pdf has the form

$\displaystyle f(\mathbf{x}; \theta)
= \exp\left[ n c_0(\theta) + \sum_{j=1}^m c_j(\theta) k_j(\mathbf{x})
+ h(\mathbf{x}) \right],
$

and that a prior distribution is given by

$\displaystyle \pi(\theta;\eta_0,\eta_1,\ldots,\eta_{m})
\propto
\exp\left[ c_0(\theta)\eta_{0} + \sum_{j=1}^m c_j(\theta) \eta_j
\right].
$

Then, multiplying the prior by the likelihood and absorbing the data-only factor $ \exp[h(\mathbf{x})]$ into the normalizing constant, we obtain the posterior density

$\displaystyle \pi(\theta\:\vert\:\mathbf{x})
= \pi(\theta;\eta_0+n,\eta_1+k_1(\mathbf{x}),\ldots,\eta_m+k_m(\mathbf{x})).
$

Thus, the family $ \pi(\theta;\eta_0,\eta_1,\ldots,\eta_{m})$ is conjugate to $ f(\mathbf{x};\theta)$, and the parameter $ (\eta_0,\eta_1,\ldots,\eta_{m})$ of the prior distribution is called the hyperparameter.
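
The update is pure bookkeeping on the hyperparameters: add $ n$ to $ \eta_0$ and $ k_j(\mathbf{x})$ to $ \eta_j$. A minimal Python sketch of this rule follows; the function conjugate_update and the sample data are hypothetical, and the Bernoulli trials of the next paragraph (where $ m=1$ and $ k_1(\mathbf{x})=\sum_{i=1}^n x_i$) serve as the worked instance.

def conjugate_update(eta, x, stats):
    """Map (eta_0, ..., eta_m) to (eta_0 + n, eta_1 + k_1(x), ..., eta_m + k_m(x)),
    where stats = [k_1, ..., k_m] are the sufficient statistics."""
    n = len(x)
    return (eta[0] + n,) + tuple(e + k(x) for e, k in zip(eta[1:], stats))

# Bernoulli instance: f(x; theta) = exp[ n log(1-theta) + (sum x_i) log(theta/(1-theta)) ],
# so m = 1 and k_1(x) = sum(x).
print(conjugate_update((0.0, 0.0), [1, 0, 1, 1, 0], [sum]))  # (5.0, 3.0)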

Bernoulli trials. Consider $ n$ independent Bernoulli trials with success probability $ \theta$. Let

$\displaystyle \pi(\theta; \eta_1, \eta_2) \propto \theta^{\eta_1}
(1-\theta)^{\eta_2}
$

be a prior density of the beta distribution, namely $ \mathrm{Beta}(\eta_1+1,\,\eta_2+1)$. Given the data $ \mathbf{x} = (x_1,\ldots,x_n)$, the posterior density is calculated as

$\displaystyle \pi(\theta\:\vert\:\mathbf{x})
= \textstyle\pi\left(\theta;
\eta_1 + \sum_{i=1}^n x_i, \eta_2 + n - \sum_{i=1}^n x_i
\right) .
$

The expected value of the posterior density, namely the mean of $ \mathrm{Beta}\left(\eta_1+\sum_{i=1}^n x_i+1,\; \eta_2+n-\sum_{i=1}^n x_i+1\right)$, becomes

$\displaystyle E(\theta\:\vert\:\mathbf{x}) =
\dfrac{\eta_1 + \sum_{i=1}^n x_i + 1}{\eta_1 + \eta_2 + n + 2} .
$
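
Since the prior $ \theta^{\eta_1}(1-\theta)^{\eta_2}$ is the $ \mathrm{Beta}(\eta_1+1,\,\eta_2+1)$ density, this mean can be checked with scipy; the hyperparameters and data below are hypothetical.

from scipy import stats

eta1, eta2 = 2.0, 2.0                  # hypothetical hyperparameters
x = [1, 0, 1, 1, 0]                    # hypothetical Bernoulli data
n, s = len(x), sum(x)

# Posterior is Beta(eta1 + s + 1, eta2 + n - s + 1).
posterior = stats.beta(eta1 + s + 1, eta2 + n - s + 1)
closed_form = (eta1 + s + 1) / (eta1 + eta2 + n + 2)
print(posterior.mean(), closed_form)   # both 0.5454..., agreeing with the formula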

