(One intermediate revision by the same user not shown)
Line 2: Line 2:
  
 
=Details of Lecture 8, [[ECE662]] Spring 2010=
 
=Details of Lecture 8, [[ECE662]] Spring 2010=
 +
In Lecture 8, we justified the commonly made assumption that the features are normally distributed with the Central Limit Theorem. We then discussed the probability of error when using Bayes decision rule. More precisely, we obtained the Chernoff Bound and the Bhattacharrya bound for the probability of error.
  
 +
Note for this lecture can be found [[noteslecture8ECE662S10|here]].
 +
  
Classic central limit Thm (Second Fundamental probabilistic):
+
Previous: [[Lecture7ECE662S10|Lecture 7]]
 +
Next: [[Lecture9ECE662S10|Lecture 9]]
  
"The distribution of the average of a large number of samples from a distribution tends to be normal"
 
 
let X1,X2,...,Xn be n independent and identically distributed variables (i.i.d) with finite mean <math>\mu</math> and finite variance <math>\sigma^2>0</math>.Then as n increases the distribution of <math>\Sigma_{i=1}^n \frac{X_i} {n}</math> approaches <math>N(\mu,\frac {\sigma^2}{n})</math>.
 
 
More precisely the random variable <math>Z_n = \frac{\Sigma_{i=1}^n X_i - n \mu}{\sigma \sqrt{n}}</math> has <math>P(Z_n)\longrightarrow N(0,1)</math> when <math>n \longrightarrow \infty</math>
 
 
More generalization of central limit Thm.
 
 
let X1,X2,...,Xn be n independent variables
 
 
Xi has mean <math>\mu_i</math> and finite variance <math>\sigma^2 > 0</math> ,i=1,2,...,n
 
 
Then <math>Z_n = \frac{\Sigma_{i=1}^n X_i - \Sigma_{i=1}^n \mu_i} {\sqrt{\Sigma_{i=1}^n \sigma^2}}</math> has <math>P(Z_n)\longrightarrow N(\mu ,\Sigma)</math> when <math>n \longrightarrow \infty</math>
 
 
-------------------------------------
 
Error bounds for Bayes decision rule:
 
 
As we know Bayes decision rule guarantees the lowest average error rate; It Does not tell what the probability of error actually is. Therefore people try to find upper bounds for that: Chernoff and Bhattacharyya bounds
 
 
<math>P(error)=\int_{R^n} p(error,x)dx = \int_{R^n} p(error|x)p(x)dx</math>
 
 
in two class case we have:
 
 
<math> p(error|x) = min \{ p(\omega_1|x) , p(\omega_2|x) \} </math>
 
<math> \Rightarrow P(error)= \int_{R^n}min\{p(\omega_1|x),p(\omega_2|x)\} p(x)dx </math>
 
 
as we know from Lemma: <math>min \{a,b\}\leq a^\beta b^ {1-\beta}</math>, <math>\forall a,b \geq 0 , \forall \beta s.t 0 \leq\beta\leq 1 </math>
 
 
<math>P(error)\leq \int_{R^n} p(\omega_1 |x)^\beta p(\omega_2 |x)^{1-\beta} p(x) dx </math> , <math>\forall 0 \leq \beta \leq 1</math>
 
 
<math>P(error)\leq \int_{R^n} ({\frac {p(x|\omega)P(\omega_1)}{p(x)}})^\beta ({\frac {p(x|\omega_2)P(\omega_2)}{p(x)}})^{1-\beta} p(x) dx </math> , <math>\forall 0 \leq \beta \leq 1</math>
 
 
<math>P(error)\leq P(\omega_1) ^\beta P(\omega_2)^ https://kiwi.ecn.purdue.edu/rhea/skins/common/images/button_math.png{1-\beta} \int_{R^n} p(x|\omega_1)^\beta p(x|\omega_2)^{1-\beta} dx =: \varepsilon_\beta </math> , <math>\forall 0 \leq \beta \leq 1</math>
 
 
The lower bound <math>S:= min\varepsilon_\beta</math> , <math>\beta \in [0,1]</math>is an upper bound for P(error).
 
 
S is called "Chernoff Bound" and <math> \varepsilon_{1/2}</math> is called "Bhattacharyya Bound"
 
 
<math> \varepsilon_{1/2}= \sqrt{P(\omega_1)P(\omega_2)} \int_{R^n} \sqrt{p(x|\omega_1)p(x|\omega_2)} dx = e^{-f(1/2)} \sqrt {P(\omega_1)P(\omega_2)}</math>
 
 
<math>\varepsilon_\beta = P(\omega_1) ^\beta P(\omega_2)^ {1-\beta} \int_{R^n} p(x|\omega_1)^\beta p(x|\omega_2)^{1-\beta} dx = e^{-f(\beta)} P(\omega_1)^\beta P(\omega_2)^ {1-\beta}</math>
 
 
Here <math>f(1/2)</math> is "Bhattacharyya distance" and <math>f(\beta)</math> is "Chernof distance"
 
 
Especial case: Bounds for Gaussian density:
 
 
<math>f(\beta)= {\frac{ \beta (1- \beta)}{2}} {(\mu_2 - \mu_1)^T} [ \beta \Sigma_1 + (1- \beta) \Sigma_2]^{-1} (\mu_2-\mu_1) + \frac {1}{2} ln{\frac {|\beta \Sigma_1 + (1-\beta) \Sigma_2|}{|\Sigma_1|^\beta |\Sigma_2|^ {1-\beta}}}</math>
 
 
<math>f(1/2)=\frac{1}{8} (\mu_2 - \mu_1)^T [\frac {\Sigma_1 + \Sigma_2} {2}]^{-1} (\mu_2 - \mu_1) + \frac {1} {2} ln \frac {[\frac {\Sigma_1 + \Sigma_2} {2}]} {\sqrt {|\Sigma_1| |\Sigma_2|}}</math>
 
 
The Chernoff bound is found by analytically or numerically minimizing <math>e^{-f(\beta)}</math>, therefore optimization is now in the one dimensional β space.
 
[[Image:Error.jpg]]
 
 
* Now consider a special case where : <math>\Sigma_1 = \Sigma_2 =: \Sigma</math>
 
 
<math>f(\beta) = \beta (1-\beta){(\mu_2 - \mu_1)^T} \Sigma^{-1}(\mu_2 - \mu_1)</math>
 
 
<math>\frac {df(\beta)}{d\beta} = \frac {1-2\beta}{2}{(\mu_2 - \mu_1)^T} \Sigma^{-1}(\mu_2 - \mu_1)= 0</math>
 
 
<math> \Leftrightarrow \beta = \frac {1}{2}</math>
 
therefore Bhattacharyya Bound is optimum in this case.
 
 
* A closer look at f(1/2):
 
 
<math>f(1/2)=\frac{1}{8} (\mu_2 - \mu_1)^T [\frac {\Sigma_1 + \Sigma_2} {2}]^{-1} (\mu_2 - \mu_1) + \frac {1} {2} ln \frac {[\frac {\Sigma_1 + \Sigma_2} {2}]} {\sqrt {|\Sigma_1| |\Sigma_2|}}</math>
 
 
As we see first term of this equation would be zero when <math>\mu_1 = \mu_2</math>; which shows class separability due to distance between means.(error would be higher when means of our classes are close to each other)
 
 
Second term of this equation would be zero when <math>\Sigma_1 = \Sigma_2</math>; which shows class separability due to difference of covariance matrices.
 
 
 
--[[User:Gmoayeri|Gmoayeri]] 19:49, 9 May 2010 (UTC)
 
 
Next: [[Lecture9ECE662S10|Lecture 9]]
 
 
----
 
----
[[OutlineECE662S10|Back to course outline]]
 
 
 
[[ 2010 Spring ECE 662 mboutin|Back to 2010 Spring ECE 662 mboutin]]
 
[[ 2010 Spring ECE 662 mboutin|Back to 2010 Spring ECE 662 mboutin]]
 
[[ECE662|Back to ECE662]]
 

Latest revision as of 08:09, 11 May 2010


Details of Lecture 8, ECE662 Spring 2010

In Lecture 8, we justified the commonly made assumption that the features are normally distributed with the Central Limit Theorem. We then discussed the probability of error when using Bayes decision rule. More precisely, we obtained the Chernoff Bound and the Bhattacharrya bound for the probability of error.

Note for this lecture can be found here.


Previous: Lecture 7 Next: Lecture 9


Back to 2010 Spring ECE 662 mboutin

Alumni Liaison

Prof. Math. Ohio State and Associate Dean
Outstanding Alumnus Purdue Math 2008

Jeff McNeal