Bubeck, S;
Wang, T;
Viswanathan, N;
(2013)
Multiple identifications in multi-armed bandits.
In: Dasgupta, Sanjoy and McAllester, David, (eds.)
Proceedings of the 30th International Conference on Machine Learning.
(pp. 258-265).
ACM: New York, USA.
bandit_FA.pdf - Accepted Version (410kB)
Abstract
We study the problem of identifying the top m arms in a multi-armed bandit game. Our proposed solution relies on a new algorithm based on successive rejects of the seemingly bad arms and successive accepts of the good ones. This algorithmic contribution allows us to tackle other multiple-identification settings that were previously out of reach. In particular, we show that this idea of successive accepts and rejects applies to the multi-bandit best arm identification problem.
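To make the accept/reject idea in the abstract concrete, the following is a minimal Python sketch of a fixed-budget top-m procedure. It is an illustration only: the function name `sar_top_m`, the `pull` callback, the phase-length schedule, the gap rule, and the handling of the edge cases are assumptions of this sketch and are not specified in this record. The paper itself should be consulted for the actual algorithm and its guarantees.

```python
import math
import random


def sar_top_m(pull, num_arms, m, budget):
    """Illustrative accept/reject loop (hypothetical helper, not from the record).

    pull(arm) returns one reward sample for `arm`; the function returns a set
    of m arm indices that look best after the sampling budget is spent.
    """
    K = num_arms
    if not 1 <= m < K:
        raise ValueError("need 1 <= m < num_arms")
    if budget <= K:
        raise ValueError("budget must exceed the number of arms")

    # Assumed schedule: phase lengths grow as arms are dismissed.
    log_bar = 0.5 + sum(1.0 / i for i in range(2, K + 1))
    active = list(range(K))
    sums = [0.0] * K
    counts = [0] * K
    accepted = set()
    slots = m  # number of top arms still to be identified
    pulls_so_far = 0  # pulls already given to each arm that is still active

    for phase in range(1, K):
        # Assumed shortcut: stop once the remaining decision is forced.
        if slots == 0 or slots == len(active):
            break
        target = math.ceil((budget - K) / (log_bar * (K + 1 - phase)))
        for arm in active:
            for _ in range(target - pulls_so_far):
                sums[arm] += pull(arm)
                counts[arm] += 1
        pulls_so_far = target

        means = {a: sums[a] / counts[a] for a in active}
        ranked = sorted(active, key=lambda a: means[a], reverse=True)
        next_best = means[ranked[slots]]    # best arm outside the current top
        last_in = means[ranked[slots - 1]]  # worst arm inside the current top
        # Empirical gap: distance from the boundary on the opposite side.
        gaps = [
            (means[a] - next_best) if r < slots else (last_in - means[a])
            for r, a in enumerate(ranked)
        ]
        r_star = max(range(len(ranked)), key=gaps.__getitem__)
        arm = ranked[r_star]
        active.remove(arm)
        if r_star < slots:  # clearly good: accept it
            accepted.add(arm)
            slots -= 1
        # otherwise the arm is clearly bad and is simply rejected

    if slots:  # every arm still active is needed to fill the top m
        accepted.update(active)
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    true_means = [0.9, 0.1, 0.8, 0.2, 0.5]  # made-up Bernoulli arms
    sample = lambda arm: 1.0 if rng.random() < true_means[arm] else 0.0
    print(sar_top_m(sample, num_arms=5, m=2, budget=2000))
```

Each phase dismisses the arm whose empirical mean is farthest from the current accept/reject boundary, so the easiest decisions are taken first and the remaining budget is concentrated on arms that are harder to classify.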
Type: | Proceedings paper |
---|---|
Title: | Multiple identifications in multi-armed bandits |
Event: | 30th International Conference on Machine Learning (ICML 2013), 16-21 June 2013, Atlanta, GA, USA |
ISBN-13: | 9781629933061 |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | http://proceedings.mlr.press/v28/bubeck13.pdf |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10055410 |