Pasteris, Stephen;
Rumi, Alberto;
Thiessen, Maximilian;
Saito, Shota;
Miyauchi, Atsushi;
Vitale, Fabio;
Herbster, Mark;
(2024)
Bandits with Abstention under Expert Advice.
In: Globerson, A and Mackey, L and Belgrave, D and Fan, A and Paquet, U and Tomczak, J and Zhang, C, (eds.)
Advances in Neural Information Processing Systems 37 (NeurIPS 2024).
(pp. 1-29).
NeurIPS: San Diego, CA, USA.
Text: NeurIPS-2024-bandits-with-abstention-under-expert-advice-Paper-Conference.pdf - Published Version (17MB)
Abstract
We study the classic problem of prediction with expert advice under bandit feedback. Our model assumes that one action, corresponding to the learner’s abstention from play, has no reward or loss on every trial. We propose the confidence-rated bandits with abstentions (CBA) algorithm, which exploits this assumption to obtain reward bounds that can significantly improve those of the classical EXP4 algorithm. Our problem can be construed as the aggregation of confidence-rated predictors, with the learner having the option to abstain from play. We are the first to achieve bounds on the expected cumulative reward for general confidence-rated predictors. In the special case of specialists we achieve a novel reward bound, significantly improving the previous bounds of SPECIALISTEXP (treating abstention as another action). We discuss how CBA can be applied to the problem of adversarial contextual bandits with the option of abstaining from selecting any action. We are able to leverage a wide range of inductive biases, outperforming previous approaches both theoretically and in preliminary experimental analysis. Additionally, we achieve a reduction in runtime from quadratic to almost linear in the number of contexts for the specific case of metric space contexts.
Type: | Proceedings paper |
---|---|
Title: | Bandits with Abstention under Expert Advice |
Event: | 38th Conference on Neural Information Processing Systems (NeurIPS 2024) |
ISBN-13: | 9798331314385 |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://papers.nips.cc/paper_files/paper/2024/hash... |
Language: | English |
Additional information: | This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10207448 |