UCL Discovery

A simulation study to compare robust clustering methods based on mixtures

Coretto, P; Hennig, C (2010) A simulation study to compare robust clustering methods based on mixtures. Advances in Data Analysis and Classification, 4 (2-3), 111-135. doi:10.1007/s11634-010-0065-4.

Full text not available from this repository.

Abstract

The following mixture model-based clustering methods are compared in a simulation study with one-dimensional data, a fixed number of clusters, and a focus on outliers and uniform "noise": a maximum likelihood estimator (MLE) for Gaussian mixtures; an MLE for a mixture of Gaussians and a uniform distribution (interpreted as a "noise component" to catch outliers); an MLE for a mixture of Gaussian distributions in which a uniform distribution over the range of the data is fixed (Fraley and Raftery in Comput J 41:578-588, 1998); a pseudo-MLE for a Gaussian mixture with an improper fixed constant over the real line to catch "noise" (RIMLE; Hennig in Ann Stat 32(4):1313-1340, 2004); and MLEs for mixtures of t-distributions with and without estimation of the degrees of freedom (McLachlan and Peel in Stat Comput 10(4):339-348, 2000). The RIMLE (using a method to choose the fixed constant first proposed in Coretto, The noise component in model-based clustering. PhD thesis, Department of Statistical Science, University College London, 2008) is the best method in some, and acceptable in all, simulation setups, and can therefore be recommended.
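
To make the idea of an improper constant "noise" component concrete, the following is a minimal sketch of EM-type iterations for a one-dimensional Gaussian mixture with a fixed constant density added. It is not the procedure evaluated in the paper: the function names, the value `c = 0.01`, the lower bound `sigma_min` on the standard deviations, the starting values, and the toy data are all assumptions made for illustration. In particular, the constant is simply fixed by hand here, whereas the abstract refers to a dedicated method for choosing it.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_noise_mixture(xs, init_mus, c=0.01, sigma_min=0.1, n_iter=200):
    """EM-type iterations for k Gaussians plus a fixed improper constant c.

    Pseudo-density: pi_0 * c + sum_j pi_j * N(x; mu_j, sigma_j^2).
    c and sigma_min are illustrative assumptions, not values from the paper.
    """
    k, n = len(init_mus), len(xs)
    mus = list(init_mus)
    sigmas = [1.0] * k
    pis = [1.0 / (k + 1)] * (k + 1)  # pis[0] is the noise proportion
    resp = []
    for _ in range(n_iter):
        # E-step: posterior weight of noise (index 0) and of each Gaussian.
        resp = []
        for x in xs:
            w = [pis[0] * c]
            w += [pis[j + 1] * normal_pdf(x, mus[j], sigmas[j]) for j in range(k)]
            total = sum(w)
            resp.append([v / total for v in w])
        # M-step: update proportions, then means and standard deviations.
        for j in range(k + 1):
            pis[j] = sum(r[j] for r in resp) / n
        for j in range(k):
            wj = sum(r[j + 1] for r in resp)
            if wj <= 1e-12:
                continue  # component has effectively no points; leave it as is
            mus[j] = sum(r[j + 1] * x for r, x in zip(resp, xs)) / wj
            var = sum(r[j + 1] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / wj
            # Crude guard against degenerate (near-zero variance) solutions.
            sigmas[j] = max(math.sqrt(var), sigma_min)
    return pis, mus, sigmas, resp


# Toy data: two Gaussian clusters plus three gross outliers (all hypothetical).
random.seed(0)
outliers = [50.0, 60.0, -40.0]
data = ([random.gauss(0.0, 1.0) for _ in range(100)]
        + [random.gauss(10.0, 1.0) for _ in range(100)]
        + outliers)

pis, mus, sigmas, resp = fit_noise_mixture(data, init_mus=[1.0, 9.0])
noise_weights = [r[0] for r in resp[-len(outliers):]]
print("proportions:", pis)
print("means:", mus)
print("noise weight of each outlier:", noise_weights)
```

In this sketch the outliers lie so far from both clusters that their Gaussian densities are negligible, so the constant term absorbs them and the estimated means are driven by the cluster points alone. A plain Gaussian mixture without the constant term would have to accommodate those points with one of its Gaussian components.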

Type: Article
Title: A simulation study to compare robust clustering methods based on mixtures
DOI: 10.1007/s11634-010-0065-4
Keywords: Model-based clustering, Gaussian mixture, Mixture of t-distributions, Noise component, Location-scale mixtures, Maximum likelihood, EM algorithm, t-distribution, Estimators
UCL classification: UCL > School of BEAMS > Faculty of Maths and Physical Sciences > Statistical Science