
This approach consists in using certain models for clusters and attempting to optimize the fit between the data and the model. In practice, each cluster is represented mathematically by a parametric distribution, such as a Gaussian, so the entire data set is modelled by a mixture of these distributions.
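
To make the idea concrete, a mixture can be written as a weighted sum of the cluster distributions. The notation here is an assumption made for illustration: $ K $ is the number of clusters, $ P(w_i) $ is the probability of cluster $ i $ (with the $ P(w_i) $ summing to 1), and $ p(x \mid w_i) $ is the density of cluster $ i $:

$ p(x) = \sum_{i=1}^{K} P(w_i) \, p(x \mid w_i) $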

Mixture of Gaussians

The most widely used clustering method of this kind is based on learning a mixture of Gaussians. The method works as follows:

1) It chooses a component (a Gaussian) at random with probability $ P(w_i) $.

2) It samples a point from $ N(m_i, std^2 I) $, where $ m_i $ is the mean of the chosen component and $ std^2 I $ is its covariance. So basically it is trying to choose clusters so that the average distance between the data vectors and the chosen means (the model parameters) is a minimum. The result is a representation of the way the data is distributed in the space.

3) Using the Expectation-Maximization (EM) algorithm, we maximise the likelihood function of the data with respect to the model parameters.
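
The three steps above can be sketched in Python. This is only a minimal illustration under stated assumptions, not a reference implementation: the function names `sample_mixture`, `log_gauss` and `em_fit` are hypothetical, the common standard deviation `std` is treated as known and held fixed (so EM only updates $ P(w_i) $ and $ m_i $), and only the standard library is used.

```python
import math
import random


def sample_mixture(n, weights, means, std, rng):
    """Draw n points from a mixture of spherical Gaussians."""
    points = []
    for _ in range(n):
        # Step 1: choose component i at random with probability P(w_i).
        i = rng.choices(range(len(weights)), weights=weights)[0]
        # Step 2: sample a point from N(m_i, std^2 I), one coordinate at a time.
        points.append(tuple(rng.gauss(m, std) for m in means[i]))
    return points


def log_gauss(x, mean, std):
    """Log density of the spherical Gaussian N(mean, std^2 I) at x."""
    d = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, mean))
    return (-0.5 * sq_dist / std ** 2
            - d * math.log(std)
            - 0.5 * d * math.log(2 * math.pi))


def em_fit(points, k, std, iters, rng):
    """Step 3: EM updates for the weights P(w_i) and means m_i (std fixed)."""
    n, d = len(points), len(points[0])
    means = [list(p) for p in rng.sample(points, k)]  # start at random data points
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior probability of each component for each point,
        # computed in log space and normalised for numerical stability.
        resp = []
        for x in points:
            logs = [math.log(weights[i]) + log_gauss(x, means[i], std)
                    for i in range(k)]
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weights and means from the responsibilities.
        for i in range(k):
            r_i = sum(r[i] for r in resp) + 1e-12  # guard against an empty component
            weights[i] = r_i / n
            means[i] = [sum(r[i] * x[j] for r, x in zip(resp, points)) / r_i
                        for j in range(d)]
    return weights, means


if __name__ == "__main__":
    rng = random.Random(0)
    data = sample_mixture(600, [0.3, 0.7], [(0.0, 0.0), (5.0, 5.0)], 1.0, rng)
    est_weights, est_means = em_fit(data, k=2, std=1.0, iters=50, rng=rng)
    print(est_weights, est_means)
```

With well-separated components, the estimated means should land near the true ones, although EM only finds a local maximum of the likelihood and the result can depend on the initial means.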
