K-Nearest Neighbors Density Estimation
A slecture by CIT student Raj Praveen Selvaraj
Partly based on the ECE662 Spring 2014 lecture material of Prof. Mireille Boutin.
Introduction
This slecture discusses the K-Nearest Neighbors (k-NN) approach to estimating the density of a given distribution. The k-NN approach is very popular in signal and image processing for clustering and classification of patterns. It is a non-parametric density estimation technique that lets the region volume be a function of the training data. We will first discuss the basic principle behind the k-NN approach to estimating the density at a point x, and then move on to building a classifier using the k-NN density estimate.
Basic Principle
The general formulation for density estimation states that, for N observations x<sub>1</sub>, x<sub>2</sub>, x<sub>3</sub>, ..., x<sub>N</sub>, the density at a point x can be approximated by

<math>p(x) \approx \frac{k}{NV}</math>

where V is the volume of some neighborhood (say A) around x and k denotes the number of observations that are contained within that neighborhood.
The basic idea of k-NN is to grow the neighborhood around x until the k nearest samples are included. If we take the neighborhood around x to be a sphere, then for the given N observations we pick an integer that grows with N, a common choice being

<math>k_N = \lceil \sqrt{N} \rceil</math>
If x<sub>l</sub> is the k-th closest sample point to x, then <math>h_k = \|x_l - x\|</math>.
The volume of this sphere is <math>V_k = c_n h_k^n</math>, where c<sub>n</sub> is the volume of the unit ball in R<sup>n</sup>. We approximate the density p(x) by

<math>\hat{p}(x) = \frac{k}{N V_k} = \frac{k}{N c_n h_k^n}</math>

Most of the time, this estimate is {equation here}
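To make this concrete, here is a minimal Python/NumPy sketch of the k-NN density estimate described above. The function names (unit_ball_volume, knn_density) and the toy normal-sample check are our own illustration, not part of the lecture material.

<pre>
import numpy as np
from math import gamma, pi

def unit_ball_volume(n):
    """Volume c_n of the unit ball in R^n: pi^(n/2) / Gamma(n/2 + 1)."""
    return pi ** (n / 2) / gamma(n / 2 + 1)

def knn_density(x, samples, k):
    """k-NN density estimate at point x from an (N, n) array of samples."""
    N, n = samples.shape
    # h_k is the distance from x to its k-th closest sample
    h_k = np.sort(np.linalg.norm(samples - x, axis=1))[k - 1]
    V = unit_ball_volume(n) * h_k ** n  # smallest ball around x holding k samples
    return k / (N * V)

# Toy check: a 1-D standard normal has density 1/sqrt(2*pi) ~ 0.399 at x = 0
rng = np.random.default_rng(0)
samples = rng.standard_normal((1000, 1))
k = int(np.ceil(np.sqrt(len(samples))))  # the k_N = ceil(sqrt(N)) choice from above
print(knn_density(np.array([0.0]), samples, k))
</pre>

The brute-force distance computation above costs O(N) per query point; for larger sample sets a spatial index such as a k-d tree is commonly used instead.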
How to classify data using the k-NN density estimate
Having seen how the density at a given point x is estimated from the value of k and the given observations x<sub>1</sub>, x<sub>2</sub>, x<sub>3</sub>, ..., x<sub>N</sub>, let us now discuss how to use the k-NN density estimate for classification.
<b>Method 1:</b>
Let <math>x_0 \in \mathbb{R}^n</math> be the point to classify.
For each class i, we are given samples <math>x_{i1}, x_{i2}, \ldots, x_{iN_i}</math>.
We now pick a k<sub>i</sub> for each class and a window function. We approximate the density at x<sub>0</sub> for each class and then pick the class with the largest (prior-weighted) density, based on

<math>\hat{p}(x_0 \mid w_i) = \frac{k_i}{N_i V_i}</math>

where N<sub>i</sub> is the number of samples of class i and V<sub>i</sub> is the volume of the smallest window around x<sub>0</sub> that contains the k<sub>i</sub> nearest samples of class i; we assign x<sub>0</sub> to the class maximizing <math>\hat{p}(x_0 \mid w_i) P(w_i)</math>.
If the priors of the classes are unknown, we use ROC curves to estimate them, based on

{equation here}
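As a sketch of Method 1 (assuming the class priors are known; the ROC-based prior estimation above is not shown), one possible implementation is the following. The helper name classify_method1 and the argument layout are our own:

<pre>
import numpy as np
from math import gamma, pi

def classify_method1(x0, class_samples, ks, priors):
    """Pick the class maximizing the prior-weighted k-NN density at x0.

    class_samples: list of (N_i, n) sample arrays, one per class
    ks:            list of per-class neighbor counts k_i
    priors:        list of class priors P(w_i)
    """
    scores = []
    for samples, k, prior in zip(class_samples, ks, priors):
        N, n = samples.shape
        h_k = np.sort(np.linalg.norm(samples - x0, axis=1))[k - 1]
        V = (pi ** (n / 2) / gamma(n / 2 + 1)) * h_k ** n  # ball holding the k_i nearest class-i samples
        scores.append(prior * k / (N * V))                 # P(w_i) * p_hat(x0 | w_i)
    return int(np.argmax(scores))
</pre>

Note that each class gets its own window volume V<sub>i</sub> here; this per-class volume is what distinguishes Method 1 from Method 2 below.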
<b>Method 2:</b>
We are given samples <math>x_{i1}, x_{i2}, \ldots, x_{iN_i}</math> for each class i, drawn from a Gaussian mixture. We choose a single value of k and a single window function for all classes.
We then approximate the joint density p(x<sub>0</sub>, w<sub>i</sub>) by

<math>\hat{p}(x_0, w_i) = \frac{k_i}{N V}</math>

where V is the volume of the smallest window around x<sub>0</sub> that contains k samples and k<sub>i</sub> is the number of samples among these k that belong to class i. We then assign x<sub>0</sub> to the class w<sub>i</sub> with the largest <math>\hat{p}(x_0, w_i)</math>.
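A sketch of Method 2, again with our own function name and assuming the labeled samples are stacked in a single array:

<pre>
import numpy as np

def classify_method2(x0, samples, labels, k):
    """One shared window: among the k nearest samples to x0 (out of all N),
    count how many belong to each class (k_i) and pick the class with the
    largest count. N and V are identical for every class in k_i / (N * V),
    so they cancel out of the comparison."""
    dists = np.linalg.norm(samples - x0, axis=1)
    nearest = np.argsort(dists)[:k]  # indices of the k nearest samples overall
    classes, counts = np.unique(labels[nearest], return_counts=True)
    return classes[np.argmax(counts)]  # class with the largest k_i
</pre>

Since N and V are the same for every class, maximizing k<sub>i</sub>/(NV) amounts to a majority vote among the k nearest neighbors, which is exactly the classic k-NN classification rule.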
Questions and comments
If you have any questions, comments, etc., please post them on this page.