UCL Discovery

Regression on Probability Measures: A Simple and Consistent Algorithm

Szabo, Z; Sriperumbudur, B; Poczos, B; Gretton, A; (2015) Regression on Probability Measures: A Simple and Consistent Algorithm. Presented at: CRiSM Seminars, Department of Statistics, University of Warwick, Coventry, United Kingdom.
Green open access

Zoltan_Szabo_invited_talk_University_of_Warwick_29_05_2015.pdf
Available under licence: see the attached licence file.

Download (1MB)

Abstract

We address the distribution regression problem: we regress from probability measures to Hilbert-space valued outputs, where only samples are available from the input distributions. Many important statistical and machine learning problems can be phrased within this framework, including point estimation tasks without an analytical solution and multi-instance learning. However, due to the two-stage sampled nature of the problem, the theoretical analysis becomes quite challenging: to the best of our knowledge, the only existing method with performance guarantees requires density estimation (which often performs poorly in practice) and requires the distributions to be defined on a compact Euclidean domain. We present a simple, analytically tractable alternative to solve the distribution regression problem: we embed the distributions into a reproducing kernel Hilbert space and perform ridge regression from the embedded distributions to the outputs. We prove that this scheme is consistent under mild conditions (for distributions on separable topological domains endowed with kernels), and construct explicit finite-sample bounds on the excess risk as a function of the sample numbers and the problem difficulty, which hold with high probability. Specifically, we establish the consistency of set kernels in regression, which was a 15-year-old open question, and also present new kernels on embedded distributions. The practical efficiency of the studied technique is illustrated in supervised entropy learning and aerosol prediction using multispectral satellite images.
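
The two-step scheme in the abstract (embed each input distribution into a reproducing kernel Hilbert space, then run ridge regression on the embeddings) can be illustrated with a minimal sketch. This is not the authors' code. It assumes a Gaussian kernel on the raw samples and a linear kernel on the empirical mean embeddings (the set kernel); the toy task (predicting the entropy of a zero-mean Gaussian from a bag of samples), the bandwidth, the regularisation constant and all function names are illustrative choices that do not come from the talk.

```python
import numpy as np


def gaussian_gram(x, y, bandwidth):
    """Gaussian kernel matrix between sample sets of shape (n, d) and (m, d)."""
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def set_kernel(bags_a, bags_b, bandwidth):
    """K[i, j] = inner product of the empirical mean embeddings of two bags,
    i.e. the average of the sample kernel over all cross pairs."""
    K = np.empty((len(bags_a), len(bags_b)))
    for i, a in enumerate(bags_a):
        for j, b in enumerate(bags_b):
            K[i, j] = gaussian_gram(a, b, bandwidth).mean()
    return K


def fit_ridge(bags, targets, bandwidth, reg):
    """Kernel ridge regression on the embedded bags; returns dual coefficients."""
    n = len(bags)
    K = set_kernel(bags, bags, bandwidth)
    return np.linalg.solve(K + n * reg * np.eye(n), targets)


def predict(train_bags, alpha, test_bags, bandwidth):
    """Predict outputs for new bags from the fitted dual coefficients."""
    return set_kernel(test_bags, train_bags, bandwidth) @ alpha


def make_bags(rng, n_bags, bag_size):
    """Toy data (an assumption): bag i holds samples from N(0, sigma_i^2),
    and the target is the differential entropy of that Gaussian."""
    sigmas = rng.uniform(0.5, 2.0, size=n_bags)
    bags = [rng.normal(0.0, s, size=(bag_size, 1)) for s in sigmas]
    entropies = 0.5 * np.log(2.0 * np.pi * np.e * sigmas ** 2)
    return bags, entropies


rng = np.random.default_rng(0)
train_bags, train_y = make_bags(rng, n_bags=60, bag_size=50)
test_bags, test_y = make_bags(rng, n_bags=20, bag_size=50)

alpha = fit_ridge(train_bags, train_y, bandwidth=1.0, reg=1e-3)
pred = predict(train_bags, alpha, test_bags, bandwidth=1.0)
rmse = np.sqrt(np.mean((pred - test_y) ** 2))
print(f"test RMSE on the toy entropy task: {rmse:.3f}")
```

The learner only ever sees the samples in each bag, never the underlying distribution, which mirrors the two-stage sampled setting described above. In practice the bandwidth and regularisation constant would be chosen by cross-validation rather than fixed by hand.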

Type: Conference item (Presentation)
Title: Regression on Probability Measures: A Simple and Consistent Algorithm
Event: CRiSM Seminars, Department of Statistics, University of Warwick
Location: Coventry, United Kingdom
Dates: 29 May 2015
Open access status: An open access version is available from UCL Discovery
Publisher version: http://www2.warwick.ac.uk/fac/sci/statistics/event...
Language: English
Keywords: Consistency, convergence rate, distribution regression, mean embedding, set kernel, two-stage sampling.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Gatsby Computational Neurosci Unit
URI: https://discovery.ucl.ac.uk/id/eprint/1469118
Downloads since deposit: 23
