Definition: A machine learning technique in which a function is created from training data.

In supervised learning, the function(model) defines the effect, one set of observations(inputs) has on another set of observations(outputs). The inputs are assumed to be at the beginning and outputs at the end of the causal chain and mediating variables are included in the function(model). The output of the function can be a continuous value or can predict a class label of the input object. The task of the supervised learner is to predict the value of the function for any valid input object after having seen a number of training examples (i.e. pairs of input and target output). To achieve this, the learner has to generalize from the presented data to unseen situations in a "reasonable" way.

Supervised learning can generate models of two types.

  1. Global model
  2. Local model

In order to solve a given problem of supervised learning (e.g. learning to recognize handwriting) one has to consider various steps:

  1. Determine the type of training examples. For instance, a single handwritten character, an entire handwritten word, or an entire line of handwriting.
  2. Gathering a training set. The set of input objects and corresponding outputs should be gathered real-world function measurements.
  3. Determine the input feature representation of the learned function. The accuracy of the learned function depends strongly on how the input object is represented. Typically, the input object is transformed into a feature vector, which contains a number of features that are descriptive of the object. The number of features should not be too large, because of the curse of dimensionality; but should be large enough to accurately predict the output.
  4. Determine the structure of the learned function and corresponding learning algorithm. For example, one may choose to use artificial neural networks or decision trees.
  5. Complete the design. Run the learning algorithm on the gathered training set. Parameters of the learning algorithm may be adjusted by optimizing performance on a subset (called a validation set) of the training set, or via cross-validation. After parameter adjustment and learning, the performance of the algorithm may be measured on a test set that is separate from the training set.

Another term for supervised learning is classification. Classifier performance depend greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given problems. Determining a suitable classifier for a given problem is however still more an art than a science. The most widely used classifiers are the Neural Network (Multi-layer Perceptron), Support Vector Machines, k-Nearest Neighbors, Gaussian Mixture Model, Gaussian, Naive Bayes, Decision Tree and RBF classifiers.

Source: http://en.wikipedia.org/wiki/Supervised_learning

(See also: [Unsupervised Learning])

Alumni Liaison

Prof. Math. Ohio State and Associate Dean
Outstanding Alumnus Purdue Math 2008

Jeff McNeal