UCL Discovery
UCL home » Library Services » Electronic resources » UCL Discovery

An Efficient Algorithm for Bayesian Nearest Neighbours

Nuti, G; (2019) An Efficient Algorithm for Bayesian Nearest Neighbours. Methodology and Computing in Applied Probability , 21 pp. 1251-1258. 10.1007/s11009-018-9670-z. Green open access

[thumbnail of An Efficient Algorithm for Bayesian.pdf]
Preview
Text
An Efficient Algorithm for Bayesian.pdf - Published Version

Download (841kB) | Preview

Abstract

K-Nearest Neighbours (k-NN) is a popular classification and regression algorithm, yet one of its main limitations is the difficulty in choosing the number of neighbours. We present a Bayesian algorithm to compute the posterior probability distribution for k given a target point within a data-set, efficiently and without the use of Markov Chain Monte Carlo (MCMC) methods or simulation—alongside an exact solution for distributions within the exponential family. The central idea is that data points around our target are generated by the same probability distribution, extending outwards over the appropriate, though unknown, number of neighbours. Once the data is projected onto a distance metric of choice, we can transform the choice of k into a change-point detection problem, for which there is an efficient solution: we recursively compute the probability of the last change-point as we move towards our target, and thus de facto compute the posterior probability distribution over k. Applying this approach to both a classification and a regression UCI data-sets, we compare favourably and, most importantly, by removing the need for simulation, we are able to compute the posterior probability of k exactly and rapidly. As an example, the computational time for the Ripley data-set is a few milliseconds compared to a few hours when using a MCMC approach.

Type: Article
Title: An Efficient Algorithm for Bayesian Nearest Neighbours
Open access status: An open access version is available from UCL Discovery
DOI: 10.1007/s11009-018-9670-z
Publisher version: https://doi.org/10.1007/s11009-018-9670-z
Language: English
Additional information: This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
Keywords: K-nearest neighbour, Non-parametric classification, Bayesian classification
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10064271
Downloads since deposit
163Downloads
Download activity - last month
Download activity - last 12 months
Downloads by country - last 12 months

Archive Staff Only

View Item View Item