Revision as of 20:11, 5 May 2014


Introduction to Clustering

A slecture by CS student David Runyan


Introduction


In class, we covered the simple Bayesian classifier. This form of classification falls under a category known as supervised learning, meaning that a set of labelled data is used to "train" the underlying model. However, such a data set is not always available, and we may still wish to discover some form of underlying structure in an unlabelled data set. Such a task falls under the category of unsupervised learning.

Clustering is a form of unsupervised learning. "Clustering is the problem of identifying groups, or clusters of data points in multidimensional space". For example, consider the following data set:

[Figure 1: the unlabelled data set]

Although this data set is not labelled, we can still clearly see three different groups, or clusters, of data points, as shown here:
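
To make the idea of finding groups in unlabelled data concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the text above does not name a specific algorithm, so k-means is swapped in as one common choice, and the data is synthetic (three hypothetical Gaussian blobs) rather than the data set in Figure 1. The names make_blobs, nearest, and k_means are invented for this example.

```python
import random


def make_blobs(centers, points_per_cluster=50, spread=0.4, seed=0):
    """Draw 2-D Gaussian points around each (hypothetical) center.

    The points are returned without labels, mimicking an unlabelled data set.
    """
    rng = random.Random(seed)
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers
            for _ in range(points_per_cluster)]


def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def k_means(points, k, iterations=20):
    """Plain k-means; k (the number of clusters) is assumed to be known."""
    # Farthest-point initialisation: start from the first point, then
    # repeatedly add the point farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points,
                key=lambda p: min(squared_distance(p, c) for c in centroids)))

    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))

    labels = [nearest(p, centroids) for p in points]
    return labels, centroids


# Three well-separated synthetic groups; the algorithm never sees which
# group a point was drawn from.
points = make_blobs([(0.0, 0.0), (6.0, 0.0), (3.0, 6.0)])
labels, centroids = k_means(points, k=3)
print(centroids)
```

Each entry of labels is the index of the cluster assigned to the corresponding point, and centroids holds the estimated cluster centers. Note that this sketch needs the number of clusters up front, which is itself an assumption that an unlabelled data set does not supply.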




Back to ECE662, Spring 2014
