= Discriminant Functions For The Normal Density - Part 1 =

----

'''Introduction to Normal or Gaussian Distribution'''
  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Before talking about discriminant functions for the normal density, we first need to know what a normal distribution is and how it is represented for just a single variable, and for a vector variable. Let's begin with the continuous univariate normal or Gaussian density.
  
 
<div style="margin-left: 25em;">
<math>f_x = \frac{1}{\sqrt{2 \pi} \sigma} \exp \left [- \frac{1}{2} \left ( \frac{x - \mu}{\sigma} \right)^2 \right ] </math>
</div>

<br> for which the expected value of ''x'' is

<div style="margin-left: 25em;">
<math>\mu = \mathcal{E}[x] =\int\limits_{-\infty}^{\infty} x\, p(x)\, dx</math>
</div>


and where the expected squared deviation or ''variance'' is


<div style="margin-left: 25em;">
<math>\sigma^2 = \mathcal{E}[(x- \mu)^2] =\int\limits_{-\infty}^{\infty} (x- \mu)^2 p(x)\, dx</math>
</div>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The univariate normal density is completely specified by two parameters: its mean ''&mu;'' and variance ''&sigma;<sup>2</sup>''. The function f<sub>x</sub> can be written as ''N(&mu;,&sigma;<sup>2</sup>)'', which says that ''x'' is distributed normally with mean ''&mu;'' and variance ''&sigma;<sup>2</sup>''. Samples from normal distributions tend to cluster about the mean, with a spread related to the standard deviation ''&sigma;''.
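
To make the two parameters concrete, here is a minimal Python sketch (an illustration added to this essay, not part of the standard derivation; the values ''&mu;'' = 2 and ''&sigma;'' = 1.5 are arbitrary). It evaluates the density above directly and estimates the mean and variance from random samples:

<pre>
import math
import random

def univariate_normal_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2) evaluated at x, exactly as in the formula above.
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Arbitrary example parameters (not from the essay).
mu, sigma = 2.0, 1.5

# Samples cluster about the mean with spread set by sigma.
samples = [random.gauss(mu, sigma) for _ in range(100_000)]
mean_est = sum(samples) / len(samples)
var_est = sum((s - mean_est) ** 2 for s in samples) / len(samples)

print(univariate_normal_pdf(mu, mu, sigma))  # peak value, 1/(sqrt(2*pi)*sigma)
print(mean_est, var_est)                     # approximately 2.0 and 2.25
</pre>
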

For the multivariate normal density in ''d'' dimensions, f<sub>x</sub> is written as


<div style="margin-left: 25em;">
<math>f_x = \frac{1}{(2 \pi)^ \frac{d}{2} |\boldsymbol{\Sigma}|^\frac{1}{2}} \exp \left [- \frac{1}{2} (\mathbf{x} -\boldsymbol{\mu})^t\boldsymbol{\Sigma}^{-1} (\mathbf{x} -\boldsymbol{\mu}) \right] </math>
</div>

where '''x''' is a ''d''-component column vector, '''&mu;''' is the ''d''-component mean vector, '''&Sigma;''' is the ''d''-by-''d'' covariance matrix, and '''|&Sigma;|''' and '''&Sigma;<sup>-1</sup>''' are its determinant and inverse, respectively. Also, ('''x - &mu;''')<sup>t</sup> denotes the transpose of ('''x - &mu;''').


The covariance matrix '''&Sigma;''' is defined as


<div style="margin-left: 25em;">
<math>\boldsymbol{\Sigma} = \mathcal{E} \left [(\mathbf{x} - \boldsymbol{\mu})(\mathbf{x} - \boldsymbol{\mu})^t \right] = \int(\mathbf{x} - \boldsymbol{\mu})(\mathbf{x} - \boldsymbol{\mu})^t p(\mathbf{x})\, d\mathbf{x}</math>
</div>

where the expected value of a vector or a matrix is found by taking the expected value of its individual components, i.e., if ''x<sub>i</sub>'' is the ''i''th component of '''x''', ''&mu;<sub>i</sub>'' the ''i''th component of '''&mu;''', and ''&sigma;<sub>ij</sub>'' the ''ij''th component of '''&Sigma;''', then

<div style="margin-left: 25em;">
<math>\mu_i = \mathcal{E}[x_i] </math>
</div>

and

<div style="margin-left: 25em;">
<math>\sigma_{ij} = \mathcal{E}[(x_i - \mu_i)(x_j - \mu_j)] </math>
</div>

The covariance matrix '''&Sigma;''' is always symmetric and positive semidefinite; we restrict our attention to the case in which '''&Sigma;''' is positive definite, so that the determinant of '''&Sigma;''' is strictly positive. The diagonal elements ''&sigma;<sub>ii</sub>'' are the variances of the respective ''x<sub>i</sub>'' (i.e., ''&sigma;<sub>i</sub><sup>2</sup>''), and the off-diagonal elements ''&sigma;<sub>ij</sub>'' are the covariances of ''x<sub>i</sub>'' and ''x<sub>j</sub>''. If ''x<sub>i</sub>'' and ''x<sub>j</sub>'' are statistically independent, then ''&sigma;<sub>ij</sub>'' = 0. If all off-diagonal elements are zero, ''p''('''x''') reduces to the product of the univariate normal densities for the components of '''x'''.
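
The following Python sketch (again an illustration written for this essay, with a made-up mean vector and diagonal covariance matrix) evaluates the multivariate density exactly as written above, and checks the last claim: when the off-diagonal elements are zero, the joint density equals the product of the univariate densities of the components.

<pre>
import numpy as np

def multivariate_normal_pdf(x, mu, sigma):
    # d-dimensional normal density, using the determinant and
    # inverse of the covariance matrix as in the formula above.
    d = len(mu)
    diff = x - mu
    norm_const = 1.0 / ((2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(sigma)))
    return norm_const * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)

def univariate_normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

# Hypothetical 2-D example: independent components, so sigma is diagonal.
mu = np.array([0.0, 1.0])
sigma = np.diag([1.0, 4.0])      # variances 1 and 4, covariances 0
x = np.array([0.5, 2.0])

joint = multivariate_normal_pdf(x, mu, sigma)
product = univariate_normal_pdf(0.5, 0.0, 1.0) * univariate_normal_pdf(2.0, 1.0, 2.0)
print(joint, product)            # the two values agree
</pre>
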


'''Discriminant Functions'''

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Discriminant functions are used to achieve the minimum probability of error in decision-making problems. In a problem with feature vector '''Y''' and state of nature variable ''w'', we can represent the discriminant function as:

<div style="margin-left: 25em;">
<math>g_i(\mathbf{Y}) = \ln p(\mathbf{Y}|w_i) + \ln P(w_i)  </math>
</div>

where from [[Bayesian Decision Theory - Continuous Features|previous essays]] we defined p('''Y'''|''w<sub>i</sub>'') as the conditional probability density function for '''Y''' given that the state of nature is ''w<sub>i</sub>'', and ''P''(''w<sub>i</sub>'') as the prior probability that nature is in state ''w<sub>i</sub>''. Now suppose the densities p('''Y'''|''w<sub>i</sub>'') are multivariate normal, that is, p('''Y'''|''w<sub>i</sub>'') = ''N''('''&mu;'''<sub>''i''</sub>,'''&Sigma;'''<sub>''i''</sub>). In the simplest case, where the features are statistically independent with the same variance ''&sigma;''<sup>2</sup> for every class ('''&Sigma;'''<sub>''i''</sub> = ''&sigma;''<sup>2</sup>'''I'''), the discriminant function reduces, after dropping terms that are the same for every class, to:

<div style="margin-left: 25em;">
<math>g_i(\mathbf{Y}) = - \frac{||\mathbf{Y} - \boldsymbol{\mu}_i||^2}{2\sigma^2} + \ln P(w_i) </math>,
</div>

where ||&middot;|| denotes the ''Euclidean norm'', that is,


<div style="margin-left: 25em;">
<math>||\mathbf{Y} - \boldsymbol{\mu}_i||^2 = (\mathbf{Y} - \boldsymbol{\mu}_i)^t(\mathbf{Y} - \boldsymbol{\mu}_i)</math>
</div>

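Below is a minimal two-class classification sketch in Python for the '''&Sigma;'''<sub>''i''</sub> = ''&sigma;''<sup>2</sup>'''I''' case above (added for illustration; the class means, shared variance, priors, and test point are all invented). It computes g<sub>i</sub>('''Y''') for each class and decides in favor of the class with the largest discriminant:

<pre>
import numpy as np

def g(y, mu_i, sigma_sq, prior_i):
    # Discriminant for the case Sigma_i = sigma^2 * I:
    # g_i(Y) = -||Y - mu_i||^2 / (2 sigma^2) + ln P(w_i)
    diff = y - mu_i
    return -(diff @ diff) / (2.0 * sigma_sq) + np.log(prior_i)

# Invented two-class example: shared variance, different means and priors.
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
priors = [0.6, 0.4]
sigma_sq = 1.0

y = np.array([1.0, 1.2])
scores = [g(y, mus[i], sigma_sq, priors[i]) for i in range(2)]
decision = int(np.argmax(scores))   # choose the class with the largest g_i
print(scores, "-> decide w_%d" % (decision + 1))
</pre>
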
Next week, we will look in more depth at discriminant functions for the normal density, examining the special cases of the covariance matrix.


----
*[[Honors_project_1_ECE302S12|Back to Tosin's Honors Project]]
*[[2013 Spring ECE 302 Boutin|Back to ECE 302 Spring 2013. Prof Boutin]]
*[[ECE302|Back to ECE 302]]
[[Category:Honors_project]] [[Category:ECE302]] [[Category:Pattern_recognition]]
