eprintid: 10188665
rev_number: 11
eprint_status: archive
userid: 699
dir: disk0/10/18/86/65
datestamp: 2024-03-08 17:14:47
lastmod: 2024-12-04 11:03:48
status_changed: 2024-03-08 17:14:47
type: proceedings_section
metadata_visibility: show
sword_depositor: 699
creators_name: Viallard, Paul
creators_name: Haddouche, Maxime
creators_name: Simsekli, Umut
creators_name: Guedj, Benjamin
title: Learning via Wasserstein-Based High Probability Generalisation Bounds
ispublished: pub
divisions: UCL
divisions: B04
divisions: C05
divisions: F48
note: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
abstract: Minimising upper bounds on the population risk or the generalisation gap has been widely used in structural risk minimisation (SRM) -- this is in particular at the core of PAC-Bayesian learning. Despite its successes and unfailing surge of interest in recent years, a limitation of the PAC-Bayesian framework is that most bounds involve a Kullback-Leibler (KL) divergence term (or its variations), which might exhibit erratic behavior and fail to capture the underlying geometric structure of the learning problem -- hence restricting its use in practical applications. As a remedy, recent studies have attempted to replace the KL divergence in the PAC-Bayesian bounds with the Wasserstein distance. Even though these bounds alleviated the aforementioned issues to a certain extent, they either hold in expectation, are for bounded losses, or are nontrivial to minimize in an SRM framework. In this work, we contribute to this line of research and prove novel Wasserstein distance-based PAC-Bayesian generalisation bounds for both batch learning with independent and identically distributed (i.i.d.) data, and online learning with potentially non-i.i.d. data. Contrary to previous art, our bounds are stronger in the sense that (i) they hold with high probability, (ii) they apply to unbounded (potentially heavy-tailed) losses, and (iii) they lead to optimizable training objectives that can be used in SRM. As a result we derive novel Wasserstein-based PAC-Bayesian learning algorithms and we illustrate their empirical advantage on a variety of experiments.
date: 2023
date_type: published
publisher: NeurIPS
official_url: https://papers.nips.cc/paper_files/paper/2023/hash/af2bb2b2280d36f8842e440b4e275152-Abstract-Conference.html
oa_status: green
full_text_type: pub
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2254253
lyricists_name: Guedj, Benjamin
lyricists_id: BGUED94
actors_name: Flynn, Bernadette
actors_id: BFFLY94
actors_role: owner
full_text_status: public
pres_type: paper
series: Advances in Neural Information Processing Systems
publication: NeurIPS
volume: 36
event_title: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
book_title: Advances in Neural Information Processing Systems (NeurIPS 2023)
editors_name: Oh, Alice
editors_name: Naumann, Tristan
editors_name: Globerson, Amir
editors_name: Saenko, Kate
editors_name: Hardt, Moritz
editors_name: Levine, Sergey
citation: Viallard, Paul; Haddouche, Maxime; Simsekli, Umut; Guedj, Benjamin; (2023) Learning via Wasserstein-Based High Probability Generalisation Bounds. In: Oh, Alice and Naumann, Tristan and Globerson, Amir and Saenko, Kate and Hardt, Moritz and Levine, Sergey, (eds.) Advances in Neural Information Processing Systems (NeurIPS 2023). NeurIPS Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/10188665/1/69_learning_via_wasserstein_based.pdf
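
The abstract's central technical move is replacing the KL divergence term of PAC-Bayesian bounds with a Wasserstein distance. As a reading aid only, the LaTeX sketch below shows the standard mechanism behind such a swap (Kantorovich-Rubinstein duality under a Lipschitz assumption on the loss in the hypothesis); it is a schematic, not the paper's actual theorem, and all symbols (R, \widehat{R}_S, \rho, \pi, L, m, \delta, \varepsilon) are illustrative placeholders rather than the authors' notation.

% Schematic only: placeholder notation, not the paper's statement.
% Kantorovich-Rubinstein duality bounds mean differences between a
% posterior rho and a prior pi by the 1-Wasserstein distance:
\[
\left| \mathbb{E}_{h\sim\rho}[f(h)] - \mathbb{E}_{h\sim\pi}[f(h)] \right|
\;\le\; L \, W_1(\rho,\pi)
\qquad \text{for every } L\text{-Lipschitz } f .
\]
% Applied to the generalisation gap h -> R(h) - \widehat{R}_S(h), assumed
% L-Lipschitz, a high-probability control of the gap under the prior pi
% transfers to any posterior rho at the cost of an L * W_1(rho, pi) term:
\[
\mathbb{E}_{h\sim\rho}[R(h)]
\;\le\;
\mathbb{E}_{h\sim\rho}\big[\widehat{R}_S(h)\big]
\;+\; L \, W_1(\rho,\pi)
\;+\; \varepsilon(m,\delta),
\]
% where \widehat{R}_S is the empirical risk on a sample S of size m and
% \varepsilon(m, \delta) is a confidence term valid with probability at
% least 1 - \delta. Unlike KL(rho || pi), the W_1 term stays finite even
% when rho is not absolutely continuous with respect to pi, and it
% reflects the metric geometry of the hypothesis space -- the motivation
% the abstract gives for moving away from KL-based bounds.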