UCL Discovery

Bandits with Abstention under Expert Advice

Pasteris, Stephen; Rumi, Alberto; Thiessen, Maximilian; Saito, Shota; Miyauchi, Atsushi; Vitale, Fabio; Herbster, Mark; (2024) Bandits with Abstention under Expert Advice. In: Globerson, A and Mackey, L and Belgrave, D and Fan, A and Paquet, U and Tomczak, J and Zhang, C, (eds.) Advances in Neural Information Processing Systems 37 (NeurIPS 2024). (pp. 1-29). NeurIPS: San Diego, CA, USA. Green open access.

NeurIPS-2024-bandits-with-abstention-under-expert-advice-Paper-Conference.pdf - Published Version

Abstract

We study the classic problem of prediction with expert advice under bandit feedback. Our model assumes that one action, corresponding to the learner’s abstention from play, has no reward or loss on every trial. We propose the confidence-rated bandits with abstentions (CBA) algorithm, which exploits this assumption to obtain reward bounds that can significantly improve those of the classical EXP4 algorithm. Our problem can be construed as the aggregation of confidence-rated predictors, with the learner having the option to abstain from play. We are the first to achieve bounds on the expected cumulative reward for general confidence-rated predictors. In the special case of specialists we achieve a novel reward bound, significantly improving the previous bounds of SPECIALISTEXP (treating abstention as another action). We discuss how CBA can be applied to the problem of adversarial contextual bandits with the option of abstaining from selecting any action. We are able to leverage a wide range of inductive biases, outperforming previous approaches both theoretically and in preliminary experimental analysis. Additionally, we achieve a reduction in runtime from quadratic to almost linear in the number of contexts for the specific case of metric space contexts.
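For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of one trial of the classical EXP4 update with abstention treated as just another action whose reward is always zero; it is not the CBA algorithm from the paper. The function name `exp4_step`, the `ABSTAIN` index, the reward range [-1, 1], and the exploration parameter `gamma` are assumptions made for this sketch.

```python
import math
import random

ABSTAIN = 0  # assumed index of the abstention action; its reward is always 0


def exp4_step(weights, advice, reward_fn, gamma, rng):
    """One trial of classical EXP4 (baseline sketch, not CBA).

    advice[i][a] is the probability expert i assigns to action a.
    reward_fn(a) returns a reward in [-1, 1] for a non-abstaining action.
    """
    n_actions = len(advice[0])
    total = sum(weights)
    # Mix the experts' advice by weight, then add uniform exploration.
    probs = [
        (1 - gamma) * sum(w * adv[a] for w, adv in zip(weights, advice)) / total
        + gamma / n_actions
        for a in range(n_actions)
    ]
    action = rng.choices(range(n_actions), weights=probs)[0]
    # Abstaining yields exactly zero reward; only the played action is observed.
    reward = 0.0 if action == ABSTAIN else reward_fn(action)
    # Rescale from [-1, 1] to [0, 1], then importance-weight the observation.
    estimate = ((reward + 1) / 2) / probs[action]
    new_weights = [
        w * math.exp(gamma * adv[action] * estimate / n_actions)
        for w, adv in zip(weights, advice)
    ]
    # Normalise so the weights stay bounded over many trials.
    norm = sum(new_weights)
    return action, reward, [w / norm for w in new_weights], probs


# Usage: two experts, one always abstains, one always plays action 1.
rng = random.Random(0)
advice = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
action, reward, weights, probs = exp4_step(
    [1.0, 1.0], advice, lambda a: 1.0, gamma=0.3, rng=rng
)
```

In this baseline the zero reward of abstention is only used as an ordinary observation after rescaling; according to the abstract, CBA instead exploits the fact that this reward is known in advance.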

Type: Proceedings paper
Title: Bandits with Abstention under Expert Advice
Event: 38th Conference on Neural Information Processing Systems (NeurIPS 2024)
ISBN-13: 9798331314385
Open access status: An open access version is available from UCL Discovery
Publisher version: https://papers.nips.cc/paper_files/paper/2024/hash...
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10207448