Suppose an observation $ X $ is used to estimate an unknown parameter $ \theta $ of the distribution $ f_x(X) $ (i.e. $ f_x(X) $ is some function $ g(\theta) $ of the parameter).
Consider three expressions (distributions):
1. Likelihood:
$ p(X; \theta) $ (discrete)
$ f_x(X; \theta) $ (continuous)
used for MLE: $ \overset{\land}\theta_{ML} = \underset{\theta}{\mbox{argmax}} \; f_x(X; \theta) $
2. Prior:
$ P(\theta) $ (discrete)
$ P_\theta(\theta) $ (continuous)
Encodes prior knowledge as to what $ \theta $ should be. "Prior" refers to before seeing the observation.
3. Posterior:
$ p(\theta | x) $ (discrete)
$ f_{\theta | X}(\theta | x) $ (continuous)
"Posterior" refers to after seeing the observation. Use the posterior to define the maximum a posteriori (MAP) estimate:
$ \overset{\land}\theta_{\mbox{MAP}} = \underset{\theta}{\mbox{argmax}} \; f_{\theta | X}(\theta | X) $
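
To make the difference between the two estimates concrete, here is a minimal Python sketch. Everything specific in it is an assumption chosen for illustration and is not part of the notes above: the observation is taken to be the number of heads in $ n $ coin flips (a Bernoulli likelihood), the prior on $ \theta $ is a Beta(a, b) density, and the helper names `log_likelihood`, `log_prior` and `argmax_on_grid` are hypothetical. It relies on Bayes' rule, under which the posterior is proportional to likelihood times prior, so the MAP estimate maximizes the sum of the log-likelihood and the log-prior. The argmax is found by a simple grid search rather than analytically.

```python
import math

def log_likelihood(theta, heads, n):
    # log p(X; theta) for n Bernoulli trials with `heads` successes
    # (the binomial coefficient is dropped: it does not depend on theta)
    return heads * math.log(theta) + (n - heads) * math.log(1.0 - theta)

def log_prior(theta, a, b):
    # log of an unnormalized Beta(a, b) density (assumed prior, for illustration)
    return (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)

def argmax_on_grid(score, steps=9999):
    # evaluate `score` on an evenly spaced grid strictly inside (0, 1)
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=score)

# hypothetical data: 7 heads in 10 flips, with a Beta(2, 2) prior
heads, n = 7, 10
a, b = 2.0, 2.0

# ML estimate: maximize the likelihood alone
theta_ml = argmax_on_grid(lambda t: log_likelihood(t, heads, n))

# MAP estimate: maximize likelihood * prior, i.e. the unnormalized posterior
theta_map = argmax_on_grid(
    lambda t: log_likelihood(t, heads, n) + log_prior(t, a, b)
)

print("ML estimate: ", theta_ml)
print("MAP estimate:", theta_map)
```

Under these assumptions the closed forms are $ \overset{\land}\theta_{ML} = \mbox{heads}/n $ and $ \overset{\land}\theta_{\mbox{MAP}} = (\mbox{heads} + a - 1)/(n + a + b - 2) $, so the grid search should land close to those values. The prior pulls the MAP estimate toward $ 1/2 $ relative to the ML estimate.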