= Discriminant Functions For The Normal Density - Part 1 =

----

'''Introduction to Normal or Gaussian Distribution'''
  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Before talking about discriminant functions for the normal density, we first need to know what a normal distribution is and how it is represented for just a single variable, and for a vector variable. Let's begin with the continuous univariate normal or Gaussian density.
  
 
<div style="margin-left: 25em;">
<math>f_x = \frac{1}{\sqrt{2 \pi} \sigma} \exp \left [- \frac{1}{2} \left ( \frac{x - \mu}{\sigma} \right)^2 \right ] </math>
</div>

<br> for which the expected value of ''x'' is

<div style="margin-left: 25em;">
<math>\mu = \mathcal{E}[x] =\int\limits_{-\infty}^{\infty} x\, p(x)\, dx</math>
</div>


and where the expected squared deviation or ''variance'' is


<div style="margin-left: 25em;">
<math>\sigma^2 = \mathcal{E}[(x- \mu)^2] =\int\limits_{-\infty}^{\infty} (x- \mu)^2 p(x)\, dx</math>
</div>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The univariate normal density is completely specified by two parameters: its mean ''&mu;'' and variance ''&sigma;<sup>2</sup>''. The function f<sub>x</sub> can be written as ''N(&mu;,&sigma;<sup>2</sup>)'', which says that ''x'' is distributed normally with mean ''&mu;'' and variance ''&sigma;<sup>2</sup>''. Samples from normal distributions tend to cluster about the mean, with a spread related to the standard deviation ''&sigma;''.
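
To make the two parameters concrete, here is a minimal Python sketch (an illustration added to this essay, not part of the standard derivation; the values ''&mu;'' = 2 and ''&sigma;'' = 1.5 are arbitrary). It evaluates the density above directly and estimates the mean and variance from random samples:

<pre>
import math
import random

def univariate_normal_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2) evaluated at x, exactly as in the formula above.
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Arbitrary example parameters (not from the essay).
mu, sigma = 2.0, 1.5

# Samples cluster about the mean with spread set by sigma.
samples = [random.gauss(mu, sigma) for _ in range(100_000)]
mean_est = sum(samples) / len(samples)
var_est = sum((s - mean_est) ** 2 for s in samples) / len(samples)

print(univariate_normal_pdf(mu, mu, sigma))  # peak value, 1/(sqrt(2*pi)*sigma)
print(mean_est, var_est)                     # approximately 2.0 and 2.25
</pre>
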

For the multivariate normal density in ''d'' dimensions, f<sub>x</sub> is written as


<div style="margin-left: 25em;">
<math>f_x = \frac{1}{(2 \pi)^ \frac{d}{2} |\boldsymbol{\Sigma}|^\frac{1}{2}} \exp \left [- \frac{1}{2} (\mathbf{x} -\boldsymbol{\mu})^t\boldsymbol{\Sigma}^{-1} (\mathbf{x} -\boldsymbol{\mu}) \right] </math>
</div>

where '''x''' is a ''d''-component column vector, '''&mu;''' is the ''d''-component mean vector, '''&Sigma;''' is the ''d''-by-''d'' covariance matrix, and '''|&Sigma;|''' and '''&Sigma;<sup>-1</sup>''' are its determinant and inverse, respectively. Also, ('''x - &mu;''')<sup>t</sup> denotes the transpose of ('''x - &mu;''').


The covariance matrix '''&Sigma;''' is defined as


<div style="margin-left: 25em;">
<math>\boldsymbol{\Sigma} = \mathcal{E} \left [(\mathbf{x} - \boldsymbol{\mu})(\mathbf{x} - \boldsymbol{\mu})^t \right] = \int(\mathbf{x} - \boldsymbol{\mu})(\mathbf{x} - \boldsymbol{\mu})^t p(\mathbf{x})\, d\mathbf{x}</math>
</div>

where the expected value of a vector or a matrix is found by taking the expected value of its individual components, i.e., if ''x<sub>i</sub>'' is the ''i''th component of '''x''', ''&mu;<sub>i</sub>'' the ''i''th component of '''&mu;''', and ''&sigma;<sub>ij</sub>'' the ''ij''th component of '''&Sigma;''', then

<div style="margin-left: 25em;">
<math>\mu_i = \mathcal{E}[x_i] </math>
</div>

and

<div style="margin-left: 25em;">
<math>\sigma_{ij} = \mathcal{E}[(x_i - \mu_i)(x_j - \mu_j)] </math>
</div>

The covariance matrix '''&Sigma;''' is always symmetric and positive semidefinite; we restrict our attention to the case in which '''&Sigma;''' is positive definite, so that the determinant of '''&Sigma;''' is strictly positive. The diagonal elements ''&sigma;<sub>ii</sub>'' are the variances of the respective ''x<sub>i</sub>'' (i.e., ''&sigma;<sub>i</sub><sup>2</sup>''), and the off-diagonal elements ''&sigma;<sub>ij</sub>'' are the covariances of ''x<sub>i</sub>'' and ''x<sub>j</sub>''. If ''x<sub>i</sub>'' and ''x<sub>j</sub>'' are statistically independent, then ''&sigma;<sub>ij</sub>'' = 0. If all off-diagonal elements are zero, ''p''('''x''') reduces to the product of the univariate normal densities for the components of '''x'''.
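
The following Python sketch (again an illustration written for this essay, with a made-up mean vector and diagonal covariance matrix) evaluates the multivariate density exactly as written above, and checks the last claim: when the off-diagonal elements are zero, the joint density equals the product of the univariate densities of the components.

<pre>
import numpy as np

def multivariate_normal_pdf(x, mu, sigma):
    # d-dimensional normal density, using the determinant and
    # inverse of the covariance matrix as in the formula above.
    d = len(mu)
    diff = x - mu
    norm_const = 1.0 / ((2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(sigma)))
    return norm_const * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)

def univariate_normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

# Hypothetical 2-D example: independent components, so sigma is diagonal.
mu = np.array([0.0, 1.0])
sigma = np.diag([1.0, 4.0])      # variances 1 and 4, covariances 0
x = np.array([0.5, 2.0])

joint = multivariate_normal_pdf(x, mu, sigma)
product = univariate_normal_pdf(0.5, 0.0, 1.0) * univariate_normal_pdf(2.0, 1.0, 2.0)
print(joint, product)            # the two values agree
</pre>
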


'''Discriminant Functions'''

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Discriminant functions are used to achieve the minimum probability of error in decision-making problems. In a problem with feature vector '''Y''' and state of nature variable ''w'', we can represent the discriminant function as:

<div style="margin-left: 25em;">
<math>g_i(\mathbf{Y}) = \ln p(\mathbf{Y}|w_i) + \ln P(w_i)  </math>
</div>

where from [[Bayesian Decision Theory - Continuous Features|previous essays]] we defined p('''Y'''|''w<sub>i</sub>'') as the conditional probability density function for '''Y''' given that the state of nature is ''w<sub>i</sub>'', and ''P''(''w<sub>i</sub>'') as the prior probability that nature is in state ''w<sub>i</sub>''. Now suppose the densities p('''Y'''|''w<sub>i</sub>'') are multivariate normal, that is, p('''Y'''|''w<sub>i</sub>'') = ''N''('''&mu;'''<sub>''i''</sub>,'''&Sigma;'''<sub>''i''</sub>). In the simplest case, where the features are statistically independent with the same variance ''&sigma;''<sup>2</sup> for every class ('''&Sigma;'''<sub>''i''</sub> = ''&sigma;''<sup>2</sup>'''I'''), the discriminant function reduces, after dropping terms that are the same for every class, to:

<div style="margin-left: 25em;">
<math>g_i(\mathbf{Y}) = - \frac{||\mathbf{Y} - \boldsymbol{\mu}_i||^2}{2\sigma^2} + \ln P(w_i) </math>,
</div>

where ||&middot;|| denotes the ''Euclidean norm'', that is,


<div style="margin-left: 25em;">
<math>||\mathbf{Y} - \boldsymbol{\mu}_i||^2 = (\mathbf{Y} - \boldsymbol{\mu}_i)^t(\mathbf{Y} - \boldsymbol{\mu}_i)</math>
</div>

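Below is a minimal two-class classification sketch in Python for the '''&Sigma;'''<sub>''i''</sub> = ''&sigma;''<sup>2</sup>'''I''' case above (added for illustration; the class means, shared variance, priors, and test point are all invented). It computes g<sub>i</sub>('''Y''') for each class and decides in favor of the class with the largest discriminant:

<pre>
import numpy as np

def g(y, mu_i, sigma_sq, prior_i):
    # Discriminant for the case Sigma_i = sigma^2 * I:
    # g_i(Y) = -||Y - mu_i||^2 / (2 sigma^2) + ln P(w_i)
    diff = y - mu_i
    return -(diff @ diff) / (2.0 * sigma_sq) + np.log(prior_i)

# Invented two-class example: shared variance, different means and priors.
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
priors = [0.6, 0.4]
sigma_sq = 1.0

y = np.array([1.0, 1.2])
scores = [g(y, mus[i], sigma_sq, priors[i]) for i in range(2)]
decision = int(np.argmax(scores))   # choose the class with the largest g_i
print(scores, "-> decide w_%d" % (decision + 1))
</pre>
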
Next week, we will look in more depth at discriminant functions for the normal density, examining the special cases of the covariance matrix.


----
*[[Honors_project_1_ECE302S12|Back to Tosin's Honors Project]]
*[[2013 Spring ECE 302 Boutin|Back to ECE 302 Spring 2013. Prof Boutin]]
*[[ECE302|Back to ECE 302]]
[[Category:Honors_project]] [[Category:ECE302]] [[Category:Pattern_recognition]]
