Given an observation $ X $ used to estimate an unknown parameter $ \theta $ of the distribution $ f_x(X) $ (i.e. $ f_x(X) = $ some function $ g(\theta) $).
Consider three expressions (distributions):
1. Likelihood
$ p(X; \theta) $ (discrete)
$ f_x(X; \theta) $ (continuous)
Used for the maximum likelihood estimate (MLE): $ \overset{\land}\theta_{ML} = \overset{\mbox{argmax}}\theta f_x(X; \theta) $
2. Prior
$ p(\theta) $ (discrete)
$ f_\theta(\theta) $ (continuous)
Encodes prior knowledge about what $ \theta $ should be. "Prior" refers to before seeing the observation.
3. Posterior
$ p(\theta | x) $ (discrete)
$ f_{\theta | X}(\theta | x) $ (continuous)
"Posterior" refers to after seeing the observation. Use the posterior to define the maximum a posteriori (MAP) estimate:
$ \overset{\land}\theta_{\mbox{MAP}} = \overset{\mbox{argmax}}\theta f_{\theta | X}(\theta | X) $
Using Bayes' Rule, we can expand the posterior $ f_{\theta | X}(\theta | X) $:
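$ f_{\theta | X}(\theta | X) = \frac{f_{X | \theta}(X | \theta) f_\theta(\theta)}{f_X(X)} $
Since the denominator $ f_X(X) $ does not depend on $ \theta $, it can be dropped from the maximization:
$ \overset{\land}\theta_{\mbox{MAP}} = \overset{\mbox{argmax}}\theta f_{X | \theta}(X | \theta) f_\theta(\theta) $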
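As a rough numerical sketch of the relationship above, the following Python snippet evaluates likelihood times prior on a grid of candidate $ \theta $ values and takes the argmax, with and without dividing by the evidence. The Gaussian likelihood and prior, the observed value `x_obs`, the grid, and all function names are assumptions chosen only for illustration; they do not come from the notes.

```python
import numpy as np

def unnormalized_posterior(likelihood, prior, grid):
    """Evaluate f_{X|theta}(X|theta) * f_theta(theta) on a grid of theta values."""
    return likelihood(grid) * prior(grid)

def normalized_posterior(likelihood, prior, grid):
    """Divide by a Riemann-sum approximation of the evidence f_X(X)."""
    unnorm = unnormalized_posterior(likelihood, prior, grid)
    step = grid[1] - grid[0]
    evidence = np.sum(unnorm) * step
    return unnorm / evidence

# Hypothetical setup (an assumption for illustration):
# X | theta ~ N(theta, 1), theta ~ N(0, 1), one observation x_obs.
x_obs = 2.0

def gaussian_likelihood(theta):
    return np.exp(-0.5 * (x_obs - theta) ** 2) / np.sqrt(2.0 * np.pi)

def gaussian_prior(theta):
    return np.exp(-0.5 * theta ** 2) / np.sqrt(2.0 * np.pi)

theta_grid = np.linspace(-5.0, 5.0, 10001)

# ML estimate: maximize the likelihood alone.
theta_ml = theta_grid[np.argmax(gaussian_likelihood(theta_grid))]

# MAP estimate: maximize likelihood * prior (evidence omitted).
theta_map = theta_grid[
    np.argmax(unnormalized_posterior(gaussian_likelihood, gaussian_prior, theta_grid))
]

# Same maximization on the normalized posterior, for comparison.
theta_map_normalized = theta_grid[
    np.argmax(normalized_posterior(gaussian_likelihood, gaussian_prior, theta_grid))
]

print(theta_ml, theta_map, theta_map_normalized)
```

In this toy setup the ML estimate sits near the observation, while the prior pulls the MAP estimate toward zero. Both MAP computations land on essentially the same grid point, which is why the evidence term can be ignored when maximizing.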
Example 1
$ X \sim f_x(X) = \lambda e^{-\lambda X} $
but we don't know the parameter $ \lambda $. Let us assume, however, that $ \lambda $ is actually itself exponentially distributed, i.e.
$ \lambda \sim f_\lambda(\lambda) = \Lambda e^{-\Lambda\lambda} $
where $ \Lambda $ is fixed and known.
Find $ \overset{\land}\lambda_{\mbox{map}} $.
Solution:
$ \overset{\land}\lambda_{\mbox{map}} = \overset{\mbox{argmax}}\lambda f_{\lambda | X}(\lambda | X) $
$ \overset{\land}\lambda_{\mbox{map}} = \overset{\mbox{argmax}}\lambda f_\lambda(\lambda) f_{X | \lambda}(X | \lambda) $
$ \overset{\land}\lambda_{\mbox{map}} = \overset{\mbox{argmax}}\lambda \Lambda e^{-\lambda \Lambda}\lambda e^{-\lambda X} $
$ \frac{d}{d\lambda} \lambda \Lambda e^{-\lambda(\Lambda + X)} = 0 $
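Applying the product rule to this condition gives $ \Lambda e^{-\lambda(\Lambda + X)}\left(1 - \lambda(\Lambda + X)\right) = 0 $, and since the exponential factor is never zero, the stationary point is $ \lambda = \frac{1}{\Lambda + X} $. The Python sketch below checks this numerically by maximizing the same objective on a grid. The values of `x_obs` and `big_lambda`, the grid range, and the function names are arbitrary assumptions for illustration only.

```python
import numpy as np

def example_objective(lam, x, big_lambda):
    """lambda * Lambda * exp(-lambda * (Lambda + X)): prior times likelihood."""
    return lam * big_lambda * np.exp(-lam * (big_lambda + x))

def map_closed_form(x, big_lambda):
    """Stationary point of the objective: 1 / (Lambda + X)."""
    return 1.0 / (big_lambda + x)

# Hypothetical numbers, chosen only to exercise the formula.
x_obs = 0.5
big_lambda = 2.0

# Grid search over positive lambda values (spacing of about 1e-5).
lam_grid = np.linspace(1e-6, 5.0, 500001)
lam_map_grid = lam_grid[np.argmax(example_objective(lam_grid, x_obs, big_lambda))]

lam_map_closed = map_closed_form(x_obs, big_lambda)

print(lam_map_grid, lam_map_closed)
```

The grid maximizer and the closed-form value should agree to within the grid spacing. Note that the objective is zero at $ \lambda = 0 $ and decays to zero as $ \lambda \to \infty $, so the single stationary point is a maximum.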