UCL Discovery

Incorporating prior probabilities into high-dimensional classifiers

Hall, P; Xue, JH (2010) Incorporating prior probabilities into high-dimensional classifiers. Biometrika, 97 (1), 31-48. doi:10.1093/biomet/asp081.

Full text not available from this repository.

Abstract

In standard parametric classifiers, or classifiers based on nonparametric methods but where there is an opportunity for estimating population densities, the prior probabilities of the respective populations play a key role. However, those probabilities are largely ignored in the construction of high-dimensional classifiers, partly because there are no likelihoods to be constructed or Bayes risks to be estimated. Nevertheless, including information about prior probabilities can reduce the overall error rate, particularly in cases where doing so is most important, i.e. when the classification problem is particularly challenging and error rates are not close to zero. In this paper we suggest a general approach to reducing error rate in this way, by using a method derived from Breiman's bagging idea. The potential improvements in performance are identified in theoretical and numerical work, the latter involving both applications to real data and simulations. The method is simple and explicit to apply, and does not involve choice of any tuning parameters.
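As a rough illustration of the idea in the abstract, the sketch below combines bootstrap resampling with a prior probability for a two-class problem. It is not the authors' algorithm, whose details are not given in this record. It assumes one plausible reading: each resample draws its class sizes according to the prior rather than the observed class proportions, a nearest-centroid rule is applied to each resample, and the final label is a majority vote. The names `nearest_centroid_predict`, `prior_bagged_predict`, `prior0` and `n_resamples` are hypothetical, and the number of resamples is a computational setting of this sketch only.

```python
import numpy as np


def nearest_centroid_predict(X0, X1, x):
    """Return 0 or 1 according to which class centroid is closer to x."""
    d0 = np.linalg.norm(x - X0.mean(axis=0))
    d1 = np.linalg.norm(x - X1.mean(axis=0))
    return 0 if d0 <= d1 else 1


def prior_bagged_predict(X0, X1, x, prior0, n_resamples=200, seed=0):
    """Majority vote over resamples whose class sizes follow the prior.

    Assumption of this sketch: the prior enters only through the
    resampled class sizes; the base classifier itself is unchanged.
    """
    rng = np.random.default_rng(seed)
    n = len(X0) + len(X1)
    votes_for_1 = 0
    for _ in range(n_resamples):
        # Draw the class-0 resample size from Binomial(n, prior0),
        # clipped so that both classes stay non-empty.
        n0 = int(min(max(rng.binomial(n, prior0), 1), n - 1))
        n1 = n - n0
        # Resample with replacement within each class.
        B0 = X0[rng.integers(0, len(X0), size=n0)]
        B1 = X1[rng.integers(0, len(X1), size=n1)]
        votes_for_1 += nearest_centroid_predict(B0, B1, x)
    return int(votes_for_1 > n_resamples / 2)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X0 = rng.normal(0.0, 1.0, size=(20, 50))  # class 0 training sample
    X1 = rng.normal(5.0, 1.0, size=(20, 50))  # class 1 training sample
    print(prior_bagged_predict(X0, X1, np.zeros(50), prior0=0.7))
```

In this sketch a class with a larger prior tends to receive a larger resample and hence a less noisy centroid, which in high dimensions shifts borderline points towards that class. Whether this matches the mechanism analysed in the paper should be checked against the article itself.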

Type: Article
Title: Incorporating prior probabilities into high-dimensional classifiers
DOI: 10.1093/biomet/asp081
Keywords: Bagging, Bootstrap, Centroid-based classifier, Discrimination, Error rate, Genomics, Nearest-neighbour algorithms, Resampling, Support vector machine, GENE-EXPRESSION, MICROARRAY DATA, TEXT CATEGORIZATION, CENTROID CLASSIFIER, CANCER, RATIOS
UCL classification: UCL > School of BEAMS > Faculty of Maths and Physical Sciences > Statistical Science