Vector-valued distribution regression: a simple and consistent approach

Szabo, Z; Gretton, A; Poczos, B; Sriperumbudur, B; (2014) Vector-valued distribution regression: a simple and consistent approach. Presented at: Statistical Science Seminars, London, UK. Green open access

Zoltan_Szabo_invited_talk_Statistical_Science_Seminars_09_10_2014.pdf (1MB)
Available under licence: see the attached licence file.

Zoltan_Szabo_invited_talk_Statistical_Science_Seminars_09_10_2014_abstract.pdf (29kB)
Available under licence: see the attached licence file.

Abstract

We address the distribution regression problem (DRP): regressing on the domain of probability measures, in the two-stage sampled setup where only samples from the distributions are given. The DRP formulation offers a unified framework for several important tasks in statistics and machine learning, including multi-instance learning (MIL) and point estimation problems without analytical solutions. Despite the large number of MIL heuristics, there is essentially no theoretically grounded approach to tackling the DRP in the two-stage sampled case. To the best of our knowledge, the only existing technique with consistency guarantees requires kernel density estimation as an intermediate step (which often scales poorly in practice) and requires the domain of the distributions to be compact Euclidean. We analyse a simple (analytically computable) ridge regression approach to the DRP: we embed the distributions into a reproducing kernel Hilbert space and learn the regressor from the embeddings to the outputs. We show that this scheme is consistent in the two-stage sampled setup under mild conditions, for probability measure inputs defined on separable topological domains endowed with kernels, and with vector-valued outputs belonging to an arbitrary separable Hilbert space. In particular, choosing the kernel on the space of embedded distributions to be linear and the output space to be the real line, we obtain the consistency of set kernels in regression, which was a 15-year-old open question. In our talk we will present (i) the main ideas and results of consistency, (ii) concrete kernel constructions on mean embedded distributions, and (iii) two applications (supervised entropy learning and aerosol prediction based on multispectral satellite images) demonstrating the efficiency of our approach.
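
The ridge regression scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a Gaussian kernel on the input domain, a linear kernel on the mean embeddings (the set kernel case), and a regularisation term scaled by the number of bags. The function names (`gaussian_gram`, `set_kernel`, `fit_distribution_ridge`, `predict_distribution_ridge`) and the parameters `bandwidth` and `lam` are hypothetical, chosen only for illustration.

```python
import numpy as np


def gaussian_gram(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def set_kernel(bag_a, bag_b, bandwidth=1.0):
    """Linear kernel on empirical mean embeddings.

    The inner product of two empirical mean embeddings equals the
    average of all pairwise kernel values between the two bags.
    """
    return gaussian_gram(bag_a, bag_b, bandwidth).mean()


def fit_distribution_ridge(bags, y, lam=1e-3, bandwidth=1.0):
    """Kernel ridge regression on bags of samples (closed-form solution).

    bags: list of arrays, each of shape (n_i, d), one per distribution.
    y:    array of shape (n_bags,) or (n_bags, q) for vector-valued outputs.
    The n_bags * lam scaling is an assumption of this sketch.
    """
    n = len(bags)
    gram = np.array(
        [[set_kernel(bi, bj, bandwidth) for bj in bags] for bi in bags]
    )
    return np.linalg.solve(gram + n * lam * np.eye(n), y)


def predict_distribution_ridge(train_bags, coef, test_bags, bandwidth=1.0):
    """Predict outputs for new bags from the fitted coefficients."""
    cross = np.array(
        [[set_kernel(bt, bi, bandwidth) for bi in train_bags] for bt in test_bags]
    )
    return cross @ coef
```

Here each bag plays the role of the sample drawn from one input distribution, so only the second-stage samples are ever used, as in the two-stage sampled setup. Passing a two-dimensional `y` gives vector-valued outputs in a finite-dimensional space; the general separable Hilbert space case treated in the talk is not covered by this sketch.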

Type: Conference item (UNSPECIFIED)
Title: Vector-valued distribution regression: a simple and consistent approach
Event: Statistical Science Seminars
Location: London, UK
Dates: 2014-10-09 - 2014-10-09
Open access status: An open access version is available from UCL Discovery
Publisher version: http://www.gatsby.ucl.ac.uk/~szabo/talks/invited_t...
Language: English
Keywords: distribution regression, consistency, convergence rate, mean embedding, set kernel
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Gatsby Computational Neurosci Unit
URI: https://discovery.ucl.ac.uk/id/eprint/1447570