
Distribution Regression: Computational and Statistical Tradeoffs

Szabo, Z; Sriperumbudur, B; Poczos, B; Gretton, A; (2015) Distribution Regression: Computational and Statistical Tradeoffs. Presented at: CSML Lunch Talk Series, London, United Kingdom.

Text: Zoltan_Szabo_invited_talk_CSML_Lunch_Talk_Series_02_05_2014.pdf (1MB)

Abstract

In this talk I focus on the distribution regression problem: regressing from probability measures to vector-valued outputs. Many important machine learning and statistical tasks fit into this framework, including multi-instance learning and point estimation problems without an analytical solution, such as hyperparameter or entropy estimation. Despite the large number of heuristics available in the literature, the inherent two-stage sampled nature of the problem makes the theoretical analysis quite challenging: in practice only samples from sampled distributions are observable, and the estimates have to rely on similarities computed between sets of points. To the best of our knowledge, the only existing technique with consistency guarantees for distribution regression requires density estimation as an intermediate step (which often performs poorly in practice) and assumes that the domain of the distributions is compact Euclidean. I propose a simple, analytically computable, ridge regression based alternative for distribution regression: embed the distributions into a reproducing kernel Hilbert space, and learn the regressor from the embeddings to the outputs. I present the main ideas behind why this scheme is consistent in the two-stage sampled setup under mild conditions (on separable topological domains enriched with kernels), and give an exact description of the computational-statistical efficiency tradeoff, showing that the studied estimator is able to match the one-stage sampled minimax optimal rate. Specifically, this result answers a 16-year-old open question by establishing the consistency of the classical set kernel [Haussler, 1999; Gärtner et al., 2002] in regression, and also covers more recent kernels on distributions, including those due to [Christmann and Steinwart, 2010].
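To make the embedding-based scheme concrete, below is a minimal sketch in Python/NumPy of the two-stage estimator the abstract describes: each distribution is observed only through a bag of samples, bags are compared via the classical set kernel (equivalently, the inner product of empirical mean embeddings), and kernel ridge regression is run on the resulting Gram matrix. The Gaussian base kernel, the function names, and the parameter values (gamma, lam) are illustrative assumptions, not the talk's actual code.

import numpy as np

def gaussian_gram(X, Y, gamma=1.0):
    # k(x, y) = exp(-gamma * ||x - y||^2) between point sets X (n, d) and Y (m, d).
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def set_kernel(bags_a, bags_b, gamma=1.0):
    # K[i, j] = average of k(x, y) over x in bags_a[i], y in bags_b[j]:
    # the set kernel of [Haussler, 1999; Gärtner et al., 2002], i.e. the
    # inner product of the empirical mean embeddings of the two bags.
    K = np.empty((len(bags_a), len(bags_b)))
    for i, A in enumerate(bags_a):
        for j, B in enumerate(bags_b):
            K[i, j] = gaussian_gram(A, B, gamma).mean()
    return K

def fit(bags, y, gamma=1.0, lam=1e-3):
    # Kernel ridge regression on the set-kernel Gram matrix:
    # alpha = (K + n * lam * I)^{-1} y.
    K = set_kernel(bags, bags, gamma)
    n = K.shape[0]
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def predict(train_bags, alpha, test_bags, gamma=1.0):
    # y_hat = K(test, train) @ alpha.
    return set_kernel(test_bags, train_bags, gamma) @ alpha

# Toy two-stage sampled data: bag i holds 30 draws from N(mu_i, 1);
# the regression target is mu_i itself.
rng = np.random.default_rng(0)
mus = rng.uniform(-2, 2, size=50)
bags = [rng.normal(mu, 1.0, size=(30, 1)) for mu in mus]
alpha = fit(bags, mus, gamma=0.5, lam=1e-2)
test_mus = rng.uniform(-2, 2, size=5)
test_bags = [rng.normal(mu, 1.0, size=(30, 1)) for mu in test_mus]
print(predict(bags, alpha, test_bags, gamma=0.5))  # close to test_mus

Note that only the bags (samples of samples) enter the computation, never the underlying distributions: this is the two-stage sampled setting whose consistency the talk analyzes.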

Type: Conference item (Presentation)
Title: Distribution Regression: Computational and Statistical Tradeoffs
Event: CSML Lunch Talk Series
Location: London, United Kingdom
Dates: 2 May 2014
Open access status: An open access version is available from UCL Discovery
Publisher version: http://www.gatsby.ucl.ac.uk/~szabo/talks/invited_t...
Language: English
Keywords: Two-stage sampled distribution regression, kernel ridge regression, mean embedding, multi-instance learning, minimax optimality.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Gatsby Computational Neurosci Unit
URI: https://discovery.ucl.ac.uk/id/eprint/1473541