UCL Discovery

Simple consistent distribution regression on compact metric domains

Szabo, Z; Gretton, A; Póczos, B; Sriperumbudur, B; (2014) Simple consistent distribution regression on compact metric domains. Presented at: UCL-Duke Workshop on Sensing and Analysis of High-Dimensional Data (SAHD-2014), London, UK.
Green open access

PDF: szabo14simple_abstract.pdf (28kB). Available under licence: see the attached licence file.
PDF: szabo14simple_poster.pdf (186kB)

Abstract

In a standard regression model, one assumes that both the inputs and the outputs are finite-dimensional vectors. We address a variant of the regression problem, the distribution regression task, where the inputs are probability measures. Many important machine learning tasks fit naturally into this framework, including multi-instance learning, point estimation problems in statistics that lack closed-form analytical solutions, and tasks where simulation-based results are computationally expensive. Learning problems formulated on distributions face an inherent two-stage sampling challenge: only samples from sampled distributions are available for observation, and one has to construct estimates based on these sets of samples. We propose an algorithmically simple and parallelizable technique based on ridge regression to solve the distribution regression problem: we embed the distributions into a reproducing kernel Hilbert space, and learn the regressor from the embeddings to the outputs. We show that under mild conditions (for probability measures on compact metric domains with characteristic kernels) this solution scheme is consistent in the two-stage sampled setup. Specifically, we establish the consistency of set kernels in regression (a 15-year-old open question) and offer an efficient alternative to existing distribution regression methods, which focus on compact domains of Euclidean spaces and apply density estimation (which suffers from slow convergence in high dimensions).
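
The two-step scheme described in the abstract (embed each sampled distribution into an RKHS, then run ridge regression on the embeddings) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the Gaussian base kernel, the function names, the `bandwidth` and `ridge` parameters, and the `n * ridge` scaling of the regularizer are all assumptions made for illustration. The kernel between two bags of samples is taken to be the average of the pairwise base-kernel values, which corresponds to the inner product of their empirical mean embeddings (a set kernel).

```python
import numpy as np


def gaussian_gram(X, Y, bandwidth):
    """Pairwise Gaussian kernel values between the rows of X and the rows of Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def set_kernel(bag_a, bag_b, bandwidth):
    """Inner product of the empirical mean embeddings of two bags of samples."""
    return gaussian_gram(bag_a, bag_b, bandwidth).mean()


def fit_ridge(bags, targets, bandwidth=1.0, ridge=1e-3):
    """Kernel ridge regression on bags; returns the dual coefficients."""
    n = len(bags)
    K = np.array([[set_kernel(a, b, bandwidth) for b in bags] for a in bags])
    y = np.asarray(targets, dtype=float)
    # Assumed convention: regularizer scaled by the number of training bags.
    return np.linalg.solve(K + n * ridge * np.eye(n), y)


def predict(train_bags, alpha, new_bags, bandwidth=1.0):
    """Predict the output for each new bag from the fitted dual coefficients."""
    K_new = np.array(
        [[set_kernel(t, b, bandwidth) for b in train_bags] for t in new_bags]
    )
    return K_new @ alpha
```

Each entry of the Gram matrix depends on only two bags, so the entries can be computed independently of one another, which is one way to read the abstract's remark that the method is parallelizable. In a toy use, each bag would be an array of shape `(num_samples, dim)` drawn from one input distribution, and `targets` would hold one real-valued output per bag.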

Type: Poster
Title: Simple consistent distribution regression on compact metric domains
Event: UCL-Duke Workshop on Sensing and Analysis of High-Dimensional Data (SAHD-2014)
Location: London, UK
Dates: 4-5 September 2014
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: https://bitbucket.org/szzoli/ite/
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Gatsby Computational Neurosci Unit
URI: https://discovery.ucl.ac.uk/id/eprint/1435545