Covariance
- $ cov(X,Y)=E[(X-E[X])(Y-E[Y])]\! $
- $ cov(X,Y)=E[XY]-E[X]E[Y]\! $
X and Y are uncorrelated if cov(X,Y) = 0
Correlation Coefficient
$ \rho(X,Y)= \frac {cov(X,Y)}{\sqrt{var(X)} \sqrt{var(Y)}} \, $
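As a quick sanity check, the two covariance formulas and the correlation coefficient can be evaluated on a small data set. This is only a sketch: the data values and the function names are made up for illustration, and each data point is treated as an equally likely outcome.

```python
import math


def mean(values):
    """Average of a list of equally likely outcomes."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """cov(X,Y) = E[(X - E[X])(Y - E[Y])]."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def covariance_shortcut(xs, ys):
    """cov(X,Y) = E[XY] - E[X]E[Y]."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)


def correlation(xs, ys):
    """rho(X,Y) = cov(X,Y) / (sqrt(var(X)) * sqrt(var(Y)))."""
    var_x = covariance(xs, xs)  # var(X) = cov(X,X)
    var_y = covariance(ys, ys)
    return covariance(xs, ys) / (math.sqrt(var_x) * math.sqrt(var_y))


# Hypothetical data with Y = 2X, so the two variables are perfectly correlated.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

print(covariance(xs, ys), covariance_shortcut(xs, ys), correlation(xs, ys))
```

Both covariance functions should agree up to rounding, and because Y is an increasing linear function of X here, the correlation coefficient should come out at (numerically) 1.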
Markov Inequality
Loosely speaking: if a nonnegative RV has a small mean, then the probability that it takes a large value must also be small.
- $ P(X \geq a) \leq E[X]/a\! $
for all a > 0
EXAMPLE:
On average it takes 1 hour to catch a fish. What is an upper bound on the probability that it will take at least 3 hours?
SOLUTION:
Let X be the (nonnegative) time needed to catch a fish. Apply Markov's inequality with E[X] = 1 and a = 3:
$ P(X \geq 3) \leq \frac {E[X]}{3} = \frac{1}{3} $
so 1/3 is an upper bound on the probability that it will take at least 3 hours to catch a fish.
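Markov's inequality uses only the mean, so the bound is usually loose. The sketch below compares the bound of 1/3 with the actual tail probability under one possible model. The exponential distribution is an assumption made purely for illustration; the example itself does not say how the waiting time is distributed.

```python
import math
import random

MEAN_WAIT = 1.0   # E[X] = 1 hour, from the example
THRESHOLD = 3.0   # a = 3 hours

# Markov bound: P(X >= a) <= E[X] / a
markov_bound = MEAN_WAIT / THRESHOLD

# Assumption (not in the example): X is exponential with mean 1,
# so P(X >= 3) = exp(-3) exactly.
exact_tail = math.exp(-THRESHOLD / MEAN_WAIT)

# Monte Carlo estimate of the same tail probability.
random.seed(0)
samples = [random.expovariate(1.0 / MEAN_WAIT) for _ in range(100_000)]
empirical_tail = sum(x >= THRESHOLD for x in samples) / len(samples)

print("Markov bound:  ", markov_bound)
print("Exact tail:    ", exact_tail)
print("Empirical tail:", empirical_tail)
```

Under this assumed model the true probability is far below 1/3, which is consistent with the inequality: it holds for every nonnegative RV with that mean, so it cannot be tight for most of them.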
Chebyshev Inequality
"Any RV is likely to be close to its mean"
- $ \Pr(\left|X-E[X]\right|\geq C)\leq\frac{var(X)}{C^2} $
for all C > 0
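The Chebyshev bound can be checked exactly on a small discrete distribution. The sketch below uses a fair six-sided die, which is a hypothetical example chosen for illustration, and exact fractions so that no rounding is involved.

```python
from fractions import Fraction

# Hypothetical example: X is the result of one roll of a fair six-sided die.
outcomes = range(1, 7)
prob = Fraction(1, 6)

mean = sum(prob * x for x in outcomes)                    # E[X]
variance = sum(prob * (x - mean) ** 2 for x in outcomes)  # var(X)


def tail_probability(c):
    """Exact value of P(|X - E[X]| >= c)."""
    return sum(prob for x in outcomes if abs(x - mean) >= c)


def chebyshev_bound(c):
    """Right-hand side of Chebyshev's inequality, var(X) / c^2."""
    return variance / Fraction(c) ** 2


for c in (1, 2, 3):
    print(c, tail_probability(c), chebyshev_bound(c))
```

For small C the bound can exceed 1 and says nothing; it only becomes informative once C is large compared with the standard deviation.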
Weak Law of Large Numbers
The weak law of large numbers states that the sample average converges in probability towards the expected value
- $ \overline{X}_n \, \xrightarrow{P} \, \mu \qquad\textrm{for}\qquad n \to \infty. $
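Let X1, ..., Xn be independent RVs with the same distribution as X, and let Mn be their sample mean:
Mn = (X1 + ... + Xn)/n = X1/n + ... + Xn/n
E[Mn] = nE[X]/n = E[X]
Var[Mn] = Var(X1/n) + ... + Var(Xn/n) = n Var(X)/n^2 = Var(X)/n (by independence)
Applying the Chebyshev inequality to Mn, for any $ \epsilon > 0 $:
$ \Pr[\left|M_n - E[X]\right| \geq \epsilon] \leq \frac{Var(M_n)}{\epsilon^2} = \frac{Var(X)}{n\epsilon^2} $
The right-hand side goes to 0 as n goes to infinity, which is exactly convergence in probability.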
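The shrinking of the deviation probability can be seen in a small simulation. The sketch below estimates Pr[|Mn - E[X]| >= epsilon] for a few values of n and compares it with the Chebyshev bound Var(X)/(n epsilon^2). The Uniform(0,1) distribution, epsilon = 0.1, and the number of trials are all arbitrary choices for illustration.

```python
import random


def deviation_probability(n, epsilon, trials=1000):
    """Estimate Pr[|Mn - E[X]| >= epsilon] for Uniform(0,1) draws (E[X] = 0.5)."""
    count = 0
    for _ in range(trials):
        m_n = sum(random.random() for _ in range(n)) / n  # sample mean Mn
        if abs(m_n - 0.5) >= epsilon:
            count += 1
    return count / trials


random.seed(0)
EPSILON = 0.1
VAR_X = 1.0 / 12.0  # variance of Uniform(0,1)

results = {}
for n in (10, 100, 1000):
    empirical = deviation_probability(n, EPSILON)
    bound = VAR_X / (n * EPSILON ** 2)  # Chebyshev bound Var(X) / (n eps^2)
    results[n] = (empirical, bound)
    print(n, empirical, bound)
```

Both columns shrink as n grows, and the empirical frequency stays below the bound, which is typically quite conservative.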
ML Estimation Rule
$ \hat a_{ML} = \text{argmax}_a ( f_{X}(x_i;a)) $ (continuous case)
$ \hat a_{ML} = \text{argmax}_a ( Pr(x_i;a)) $ (discrete case)
If X is binomial(n, p), where X is the number of heads in n tosses,
then, for any observed value k of X:
$ \hat p_{ML}(k) = k/n $
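The result k/n can be checked numerically by maximizing the binomial likelihood over a grid of candidate values of p. This is only a brute-force sketch; the values n = 10 and k = 7, the grid resolution, and the function names are illustrative assumptions.

```python
from math import comb


def binomial_likelihood(p, n, k):
    """Pr(X = k; p) for X ~ binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def ml_estimate_grid(n, k, steps=1000):
    """Return the grid point in [0, 1] that maximizes the likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: binomial_likelihood(p, n, k))


# Hypothetical observation: 7 heads in 10 tosses.
n, k = 10, 7
p_hat = ml_estimate_grid(n, k)
print(p_hat, k / n)
```

The grid search only approximates the argmax to within the grid spacing; the closed form k/n comes from setting the derivative of the log-likelihood with respect to p to zero.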
MAP Estimation Rule
$ \hat \theta_{MAP} = \text{argmax}_\theta ( f_{\theta|X}(\theta|x)) $
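By Bayes' rule, $ f_{\theta|X}(\theta|x) = f_{X|\theta}(x|\theta)f_{\theta}(\theta)/f_{X}(x) $. Since the denominator $ f_{X}(x) $ does not depend on $ \theta $, the rule is equivalent to: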
$ \hat \theta_{MAP} = \text{argmax}_\theta ( f_{X|\theta}(x|\theta)f_{\theta}(\theta)) $
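To see how the prior changes the estimate, the sketch below maximizes the product of likelihood and prior on a grid for a coin with unknown heads probability. The Beta(2, 2) prior, the data (7 heads in 10 tosses), and the function names are assumptions chosen for illustration; they are not part of the notes above.

```python
def unnormalized_posterior(p, n, k, alpha, beta):
    """Likelihood times prior, dropping constants that do not depend on p."""
    likelihood = p ** k * (1 - p) ** (n - k)          # binomial, without C(n, k)
    prior = p ** (alpha - 1) * (1 - p) ** (beta - 1)  # Beta(alpha, beta), unnormalized
    return likelihood * prior


def map_estimate_grid(n, k, alpha, beta, steps=1000):
    """Return the grid point in [0, 1] that maximizes likelihood * prior."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: unnormalized_posterior(p, n, k, alpha, beta))


# Hypothetical setup: 7 heads in 10 tosses, Beta(2, 2) prior on p.
n, k = 10, 7
alpha, beta = 2.0, 2.0

p_map = map_estimate_grid(n, k, alpha, beta)
# Mode of the Beta(k + alpha, n - k + beta) posterior.
closed_form = (k + alpha - 1) / (n + alpha + beta - 2)
print(p_map, closed_form, k / n)
```

With this prior the MAP estimate is pulled from the ML estimate k/n toward 1/2, where the prior has its peak. With a flat prior (alpha = beta = 1) the two estimates coincide.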
Bias of an Estimator, and Unbiased estimators
The bias of an estimator $ \hat a $ is $ E[\hat a] - a $.
An estimator is unbiased if: $ E[\hat a_{ML}] = a $ for all values of a
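As a concrete check of unbiasedness, the expected value of the binomial ML estimate k/n can be computed exactly by summing over all possible k. The sketch below uses exact fractions; the particular values of n and p are arbitrary illustrations.

```python
from fractions import Fraction
from math import comb


def expected_ml_estimate(n, p):
    """E[X/n] for X ~ binomial(n, p), computed by summing over every k."""
    return sum(
        Fraction(k, n) * comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )


# Hypothetical parameter values; the expectation should equal p for each one.
for p in (Fraction(1, 10), Fraction(3, 10), Fraction(1, 2)):
    print(p, expected_ml_estimate(10, p))
```

The expectation equals p for every value tried, which agrees with E[X/n] = np/n = p, so k/n is an unbiased estimator of p.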
Confidence Intervals, and how to get them via Chebyshev
$ \theta \text{ is unknown and fixed} $
$ \hat \theta \text{ is random and should be close to } \theta \text{ most of the time} $
$ \text{If } \Pr[|\hat \theta - \theta| \leq \delta] \geq (1-a) \text{, then we say we have } (1-a) \text{ confidence in the interval } [\hat \theta - \delta, \hat \theta + \delta] $
Equivalently, the interval has a confidence level of $ (1-a) $ if $ \Pr[\hat \theta - \delta < \theta < \hat \theta + \delta] \geq (1-a) $ for all $ \theta $
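One way to get such an interval from Chebyshev is sketched below, under assumptions that are not stated in the notes: the estimator is the sample mean Mn of n independent observations, and the standard deviation sigma of a single observation is known. Chebyshev gives Pr[|Mn - theta| >= delta] <= sigma^2/(n delta^2), and setting the right-hand side equal to a yields delta = sigma/sqrt(n a).

```python
import math


def chebyshev_half_width(sigma, n, a):
    """Half-width delta with sigma^2 / (n * delta^2) = a, i.e. delta = sigma / sqrt(n * a)."""
    return sigma / math.sqrt(n * a)


def chebyshev_interval(sample, sigma, a):
    """Confidence interval for the mean with confidence level at least 1 - a.

    Assumes independent observations with known standard deviation sigma.
    """
    n = len(sample)
    m_n = sum(sample) / n  # sample mean, used as the estimate of theta
    delta = chebyshev_half_width(sigma, n, a)
    return (m_n - delta, m_n + delta)


# Hypothetical numbers: 100 observations, sigma = 1, 95% confidence (a = 0.05).
print(chebyshev_half_width(1.0, 100, 0.05))

# Hypothetical small sample with sigma = 2 and a = 0.25.
print(chebyshev_interval([1.0, 2.0, 3.0, 4.0], sigma=2.0, a=0.25))
```

Because Chebyshev makes no assumption about the shape of the distribution, this interval is valid very generally but is wider than one based on, for example, a normal approximation.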