Szabo, Z;
Gretton, A;
Poczos, B;
Sriperumbudur, B;
(2014)
Vector-valued distribution regression: a simple and consistent approach.
Presented at: Statistical Science Seminars, London, UK.
Abstract
We address the distribution regression problem (DRP): regressing on the domain of probability measures in the two-stage sampled setup, where only samples from the distributions are available. The DRP formulation offers a unified framework for several important tasks in statistics and machine learning, including multi-instance learning (MIL) and point estimation problems without analytical solutions. Despite the large number of MIL heuristics, there is essentially no theoretically grounded approach to the DRP in the two-stage sampled case. To the best of our knowledge, the only existing technique with consistency guarantees requires kernel density estimation as an intermediate step (which often scales poorly in practice) and requires the domain of the distributions to be compact Euclidean. We analyse a simple (analytically computable) ridge regression alternative for the DRP: we embed the distributions into a reproducing kernel Hilbert space and learn the regressor from the embeddings to the outputs. We show that this scheme is consistent in the two-stage sampled setup under mild conditions, for probability measure inputs defined on separable topological domains endowed with kernels, with vector-valued outputs belonging to an arbitrary separable Hilbert space. In particular, choosing the kernel on the space of embedded distributions to be linear and the output space to be the real line, we obtain the consistency of set kernels in regression, which was a 15-year-old open question. In our talk we present (i) the main ideas and results of consistency, (ii) concrete kernel constructions on mean embedded distributions, and (iii) two applications (supervised entropy learning, and aerosol prediction based on multispectral satellite images) demonstrating the efficiency of our approach.
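The two-step scheme described in the abstract (embed each distribution via its samples, then run ridge regression on the embeddings) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the talk: the function names `gaussian_gram`, `set_kernel`, `fit_ridge` and `predict`, the Gaussian base kernel, the `bandwidth` and `lam` values, the `K + l * lam * I` scaling of the regulariser (one common convention), and the toy data (bags drawn from 1-D Gaussians, with the 2-D target being the mean and its square). The linear kernel on empirical mean embeddings is used, which corresponds to the set kernel mentioned above.

```python
import numpy as np


def gaussian_gram(X, Y, bandwidth):
    """Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def set_kernel(bags_a, bags_b, bandwidth):
    """Linear kernel on empirical mean embeddings.

    K[i, j] is the average of k(x, y) over all sample pairs (x, y)
    with x from bag i of bags_a and y from bag j of bags_b.
    """
    K = np.empty((len(bags_a), len(bags_b)))
    for i, A in enumerate(bags_a):
        for j, B in enumerate(bags_b):
            K[i, j] = gaussian_gram(A, B, bandwidth).mean()
    return K


def fit_ridge(bags, Y, bandwidth, lam):
    """Solve (K + l * lam * I) alpha = Y for the ridge coefficients."""
    K = set_kernel(bags, bags, bandwidth)
    l = len(bags)
    return np.linalg.solve(K + l * lam * np.eye(l), Y)


def predict(train_bags, alpha, test_bags, bandwidth):
    """Predict vector-valued outputs for new bags."""
    return set_kernel(test_bags, train_bags, bandwidth) @ alpha


# Toy two-stage sampled data (hypothetical): first draw a distribution
# N(m, 1), then draw a bag of samples from it. Target: (m, m^2).
rng = np.random.default_rng(0)


def make_bags(num_bags, bag_size):
    means = rng.uniform(-2.0, 2.0, size=num_bags)
    bags = [rng.normal(m, 1.0, size=(bag_size, 1)) for m in means]
    targets = np.column_stack([means, means**2])
    return bags, targets


train_bags, Y_train = make_bags(60, 50)
test_bags, Y_test = make_bags(20, 50)

alpha = fit_ridge(train_bags, Y_train, bandwidth=1.0, lam=1e-3)
Y_pred = predict(train_bags, alpha, test_bags, bandwidth=1.0)

mse = np.mean((Y_pred - Y_test) ** 2)
baseline_mse = np.mean((Y_train.mean(axis=0) - Y_test) ** 2)
print(f"ridge MSE: {mse:.4f}, constant-predictor MSE: {baseline_mse:.4f}")
```

The regressor never sees the underlying distributions, only the bags, which is the two-stage sampled setting. Replacing the linear kernel on embeddings with a nonlinear one would change only `set_kernel`; the ridge solve stays the same.
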
Type: | Conference item (UNSPECIFIED) |
---|---|
Title: | Vector-valued distribution regression: a simple and consistent approach |
Event: | Statistical Science Seminars |
Location: | London, UK |
Dates: | 2014-10-09 - 2014-10-09 |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | http://www.gatsby.ucl.ac.uk/~szabo/talks/invited_t... |
Language: | English |
Keywords: | distribution regression, consistency, convergence rate, mean embedding, set kernel |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Gatsby Computational Neurosci Unit |
URI: | https://discovery.ucl.ac.uk/id/eprint/1447570 |