

K Nearest Neighbors

A slecture by an ECE student

Partly based on the ECE662 Spring 2014 lecture material of Prof. Mireille Boutin.




K Nearest Neighbors (kNN) is a classification algorithm based on local density estimation.

The goal of a classifier is to take a data-point $ x_0 $, usually represented as a vector, and assign it a class label from some set of $ C $ classes $ w_1, w_2, \ldots, w_C $.


If the data-points from each class are random variables, then it can be proven that the optimal decision rule to classify a point $ x_0 $ is Bayes Rule, which is to choose the class $ w_i $ for which the posterior probability $ P(w_i|x_0) $ is highest.
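
In symbols (standard notation, written out here for concreteness), the Bayes decision rule assigns $ x_0 $ to the class

$ w^* = \text{arg}\max_{i \in \{1, \ldots, C\}} P(w_i|x_0) $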

Therefore, it is desirable to model the data-points as random variables and to estimate $ P(w_i|x_0) $, which can then be used to classify new points.

By Bayes' Theorem, $ P(w_i|x_0) = \frac{P(x_0|w_i)P(w_i)}{P(x_0)} $. Since the denominator $ P(x_0) $ does not depend on the class, it suffices to compare $ P(x_0|w_i)P(w_i) $ across classes. This is advantageous because it is often easier to estimate $ P(x_0|w_i) $ and $ P(w_i) $ separately than to estimate $ P(w_i|x_0) $ directly.
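
As a quick illustration with made-up numbers: suppose $ C = 2 $, $ P(w_1) = 0.3 $, $ P(w_2) = 0.7 $, $ P(x_0|w_1) = 0.5 $, and $ P(x_0|w_2) = 0.1 $. Then $ P(x_0|w_1)P(w_1) = 0.15 > 0.07 = P(x_0|w_2)P(w_2) $, so $ x_0 $ is assigned to class $ w_1 $.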

K Nearest Neighbors does not explicitly estimate $ P(x_0|w_i) $ and $ P(w_i) $, but instead does so implicitly, and chooses the class with the highest estimated $ P(w_i|x_0) $.

This method belongs to the class of local density estimation methods because it forms a separate density estimate locally around each test point $ x_0 $. This is in contrast to parametric methods, which assume a functional form (for example, Gaussian) for the distributions and estimate each entire distribution by fitting the parameters of that form (the mean and variance in the Gaussian example).

The classifier is given a set of $ N $ data-points $ x_1, x_2, \ldots, x_N $ along with corresponding labels $ y_1, y_2, \ldots, y_N $.

Together these are often called the training set, because they are used to "train" the classifier, i.e. estimate the distributions.
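
To see how the local estimates arise (this is the standard kNN derivation): fix an integer $ k $ and grow a ball around the test point $ x_0 $ until it contains exactly $ k $ of the $ N $ training points. Let $ V $ denote the volume of that ball, $ k_i $ the number of the $ k $ neighbors labeled $ w_i $, and $ N_i $ the total number of training points from class $ w_i $. The local estimates are then

$ P(x_0|w_i) \approx \frac{k_i}{N_i V}, \qquad P(w_i) \approx \frac{N_i}{N} $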


Multiplying these estimates gives

$ P(w_i|x_0) \propto P(x_0|w_i)P(w_i) \approx \frac{k_i}{N_i V} \cdot \frac{N_i}{N} = \frac{k_i}{N V} $

Since $ N $, $ V $, and $ k $ are the same for every class, choosing the class with the highest estimated posterior reduces to choosing the class that appears most often among the $ k $ nearest neighbors of $ x_0 $. This majority vote is the K Nearest Neighbors decision rule.
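
To make the rule concrete, here is a minimal Python/NumPy sketch of a kNN classifier. It is illustrative only: the function name, the toy data set, and the value of $ k $ are invented for this example, not taken from the lecture material.

<pre>
import numpy as np

def knn_classify(x0, X, y, k=3):
    """Classify x0 by majority vote among its k nearest training points.

    X : (N, d) array of training points, y : (N,) array of labels.
    Uses Euclidean distance; ties go to the smallest label via argmax.
    """
    # Euclidean distance from x0 to every training point
    dists = np.linalg.norm(X - x0, axis=1)
    # Indices of the k nearest training points
    nearest = np.argsort(dists)[:k]
    # Count how many neighbors carry each label (the k_i) and take the majority
    labels, counts = np.unique(y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy two-class data set (invented for illustration)
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.9, 1.0], [1.0, 0.8]])
y = np.array([1, 1, 1, 2, 2])
print(knn_classify(np.array([0.15, 0.15]), X, y, k=3))  # prints 1
</pre>

Note that this brute-force version scans all $ N $ training points for every query; for large training sets, tree-based or approximate nearest-neighbor data structures are typically used instead.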



Questions and comments

If you have any questions, comments, etc., please post them on this page.


Back to ECE662, Spring 2014
