Revision as of 00:08, 18 April 2008 by Lbachega (Talk)

Principal Component Analysis (PCA) finds an orthonormal basis that best represents the data. Assuming the data vectors $ \vec{x_i} $ have zero mean, PCA diagonalizes the maximum likelihood estimate of the covariance matrix

$ C=\frac{1}{n} \sum_{i=1}^{n} \vec{x_i}\vec{x_i}^T $

by solving the eigenvalue equation

$ C\vec{e} = \lambda \vec{e} $

The solutions to this equation are the eigenvalues $ \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m $, where $ m $ is the dimension of the data vectors. Often only $ k < m $ eigenvalues are significantly different from zero, meaning that the inherent dimensionality of the data is $ k $, with the remaining $ m-k $ dimensions being noise.
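As a rough illustration of reading the inherent dimensionality off the eigenvalue spectrum, the following sketch builds synthetic data that lies in a 2-dimensional subspace of a 3-dimensional space and counts the eigenvalues of $ C $ that are numerically nonzero. The data, the variable names, and the relative tolerance `tol` are assumptions made for this example and are not part of the original text.

```python
import numpy as np

# Synthetic example (assumed for illustration): n samples in m dimensions
# that actually lie in a k_true-dimensional subspace.
rng = np.random.default_rng(0)
n, m, k_true = 200, 3, 2
basis = rng.standard_normal((m, k_true))   # spans the subspace
coeffs = rng.standard_normal((n, k_true))  # coordinates inside the subspace
X = coeffs @ basis.T                       # rows are the data vectors x_i^T

# The covariance formula assumes zero-mean data, so center first.
X = X - X.mean(axis=0)

# C = (1/n) * sum_i x_i x_i^T, written as a matrix product.
C = (X.T @ X) / n

# eigvalsh handles symmetric matrices and returns ascending eigenvalues;
# reverse so that lambda_1 >= lambda_2 >= ... >= lambda_m.
eigvals = np.linalg.eigvalsh(C)[::-1]

# Count eigenvalues above a relative tolerance (an arbitrary choice here).
tol = 1e-10 * eigvals[0]
k_est = int(np.sum(eigvals > tol))
```

With real, noisy data the trailing eigenvalues are small rather than exactly zero, so the threshold has to be chosen by inspecting the spectrum rather than fixed in advance.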

In order to represent the data in the $ k $-dimensional space, we first construct the matrix $ E=[\vec{e_1} \vec{e_2} \cdots \vec{e_k}] $, whose columns are the eigenvectors associated with the $ k $ largest eigenvalues.
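The construction of $ E $ can be sketched in NumPy as below. The helper name `pca_basis`, the synthetic data, and the final projection step $ \vec{y_i} = E^T \vec{x_i} $ are assumptions added for illustration; the text above stops at the construction of $ E $, and the projection shown is the usual way such a matrix is applied, not something the source states.

```python
import numpy as np

def pca_basis(X, k):
    """Return E (m x k) and all eigenvalues in descending order.

    X is an n x m array whose rows are zero-mean data vectors.
    (Hypothetical helper, written for this example.)
    """
    n = X.shape[0]
    C = (X.T @ X) / n                       # covariance estimate
    eigvals, eigvecs = np.linalg.eigh(C)    # ascending order, orthonormal columns
    order = np.argsort(eigvals)[::-1]       # indices of eigenvalues, largest first
    E = eigvecs[:, order[:k]]               # E = [e_1 e_2 ... e_k]
    return E, eigvals[order]

# Synthetic data (assumed): two high-variance and two low-variance directions.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 4)) * np.array([3.0, 2.0, 0.1, 0.05])
X = X - X.mean(axis=0)                      # center the data

E, lam = pca_basis(X, k=2)

# Assumed projection step: each row of Y is E^T x_i, the k-dimensional
# representation of the corresponding data vector.
Y = X @ E
```

Because `np.linalg.eigh` returns orthonormal eigenvectors for a symmetric matrix, the columns of `E` satisfy $ E^T E = I $, so the projection keeps the orthonormal-basis property described at the start of the page.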
