eprintid: 10188665
rev_number: 11
eprint_status: archive
userid: 699
dir: disk0/10/18/86/65
datestamp: 2024-03-08 17:14:47
lastmod: 2024-12-04 11:03:48
status_changed: 2024-03-08 17:14:47
type: proceedings_section
metadata_visibility: show
sword_depositor: 699
creators_name: Viallard, Paul
creators_name: Haddouche, Maxime
creators_name: Simsekli, Umut
creators_name: Guedj, Benjamin
title: Learning via Wasserstein-Based High Probability Generalisation Bounds
ispublished: pub
divisions: UCL
divisions: B04
divisions: C05
divisions: F48
note: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
abstract: Minimising upper bounds on the population risk or the generalisation gap has been widely used in structural risk minimisation (SRM) -- this is in particular at the core of PAC-Bayesian learning. Despite its successes and unfailing surge of interest in recent years, a limitation of the PAC-Bayesian framework is that most bounds involve a Kullback-Leibler (KL) divergence term (or its variations), which might exhibit erratic behaviour and fail to capture the underlying geometric structure of the learning problem -- hence restricting its use in practical applications. As a remedy, recent studies have attempted to replace the KL divergence in the PAC-Bayesian bounds with the Wasserstein distance. Even though these bounds alleviated the aforementioned issues to a certain extent, they either hold only in expectation, apply only to bounded losses, or are nontrivial to minimise in an SRM framework. In this work, we contribute to this line of research and prove novel Wasserstein distance-based PAC-Bayesian generalisation bounds for both batch learning with independent and identically distributed (i.i.d.) data, and online learning with potentially non-i.i.d. data. Contrary to previous art, our bounds are stronger in the sense that (i) they hold with high probability, (ii) they apply to unbounded (potentially heavy-tailed) losses, and (iii) they lead to optimisable training objectives that can be used in SRM. As a result, we derive novel Wasserstein-based PAC-Bayesian learning algorithms and illustrate their empirical advantage on a variety of experiments.
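A minimal schematic of the kind of high-probability bound the abstract describes, assuming a K-Lipschitz loss, m i.i.d. samples, a prior \pi and a posterior \rho (the exact constants and moment terms vary across the paper's theorems; see the full text): with probability at least 1 - \delta over the sample S,
\mathbb{E}_{h \sim \rho}[R(h)] \;\le\; \mathbb{E}_{h \sim \rho}[\hat{R}_S(h)] \;+\; 2K \,\mathrm{W}_1(\rho, \pi) \;+\; \mathcal{O}\!\left(\sqrt{\log(1/\delta)/m}\right).
Minimising the right-hand side over \rho yields the SRM-style training objectives mentioned in the abstract, with the Wasserstein term \mathrm{W}_1(\rho, \pi) playing the role of the usual KL penalty.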
date: 2023
date_type: published
publisher: NeurIPS
official_url: https://papers.nips.cc/paper_files/paper/2023/hash/af2bb2b2280d36f8842e440b4e275152-Abstract-Conference.html
oa_status: green
full_text_type: pub
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2254253
lyricists_name: Guedj, Benjamin
lyricists_id: BGUED94
actors_name: Flynn, Bernadette
actors_id: BFFLY94
actors_role: owner
full_text_status: public
pres_type: paper
series: Advances in Neural Information Processing Systems
publication: NeurIPS
volume: 36
event_title: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
book_title: Advances in Neural Information Processing Systems (NeurIPS 2023)
editors_name: Oh, Alice
editors_name: Naumann, Tristan
editors_name: Globerson, Amir
editors_name: Saenko, Kate
editors_name: Hardt, Moritz
editors_name: Levine, Sergey
citation: Viallard, Paul; Haddouche, Maxime; Simsekli, Umut; Guedj, Benjamin; (2023) Learning via Wasserstein-Based High Probability Generalisation Bounds. In: Oh, Alice and Naumann, Tristan and Globerson, Amir and Saenko, Kate and Hardt, Moritz and Levine, Sergey, (eds.) Advances in Neural Information Processing Systems (NeurIPS 2023). NeurIPS. Green open access.
document_url: https://discovery.ucl.ac.uk/id/eprint/10188665/1/69_learning_via_wasserstein_based.pdf