Cella, Leonardo;
Pontil, Massimiliano;
Gentile, Claudio;
(2021)
Best Model Identification: A Rested Bandit Formulation.
In: Meila, M and Zhang, T, (eds.)
Proceedings of the 38th International Conference on Machine Learning.
(pp. 1-11).
PMLR
Abstract
We introduce and analyze a best arm identification problem in the rested bandit setting, wherein arms are themselves learning algorithms whose expected losses decrease with the number of times the arm has been played. The shape of the expected loss functions is similar across arms, and is assumed to be available up to unknown parameters that have to be learned on the fly. We define a novel notion of regret for this problem, where we compare to the policy that always plays the arm having the smallest expected loss at the end of the game. We analyze an arm elimination algorithm whose regret vanishes as the time horizon increases. The actual rate of convergence depends in a detailed way on the postulated functional form of the expected losses. We complement our analysis with lower bounds, indicating strengths and limitations of the proposed solution.
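
The setting described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the loss shape `alpha / n**rho + beta` with a known exponent `rho`, the least-squares fit of `alpha` and `beta`, the heuristic confidence width, and all function names. It shows an elimination loop that pulls the surviving arms in round-robin order, extrapolates each arm's loss to the end of the game, and drops arms whose extrapolated loss is clearly worse than the best one.

```python
import numpy as np

def expected_loss(alpha, beta, rho, n):
    # Assumed loss shape: decreases with the number of pulls n of this arm.
    return alpha / n**rho + beta

def predict_final(losses, rho, horizon):
    # Least-squares fit of (alpha, beta) from the losses seen so far,
    # then extrapolation to `horizon` pulls of the same arm.
    n = np.arange(1, len(losses) + 1, dtype=float)
    X = np.column_stack([n ** (-rho), np.ones_like(n)])
    (alpha_hat, beta_hat), *_ = np.linalg.lstsq(X, np.asarray(losses), rcond=None)
    return alpha_hat / horizon**rho + beta_hat

def identify_best_arm(arms, rho, horizon, noise=0.02, min_pulls=10, seed=0):
    # `arms` is a list of hypothetical (alpha, beta) pairs, used only to simulate losses.
    rng = np.random.default_rng(seed)
    active = list(range(len(arms)))
    if len(active) == 1:
        return active[0]
    history = {i: [] for i in active}
    pulls = 0
    # The budget is checked once per round, so it may be overshot by a few pulls.
    while pulls < horizon and len(active) > 1:
        for i in active:  # one round-robin pass over the surviving arms
            alpha, beta = arms[i]
            n_i = len(history[i]) + 1  # rested: an arm's state advances only when it is pulled
            history[i].append(expected_loss(alpha, beta, rho, n_i)
                              + noise * rng.standard_normal())
            pulls += 1
        m = min(len(history[i]) for i in active)
        if m < min_pulls:
            continue
        pred = {i: predict_final(history[i], rho, horizon) for i in active}
        width = noise * np.sqrt(2 * np.log(horizon) / m)  # heuristic width, not the paper's bound
        best = min(pred.values())
        # Keep an arm only if its optimistic estimate still overlaps the best arm's pessimistic one.
        active = [i for i in active if pred[i] - width <= best + width]
    final = {i: predict_final(history[i], rho, horizon) for i in active}
    return min(final, key=final.get)

# Hypothetical arms: arm 1 has the largest loss after one pull but the smallest asymptote.
arms = [(1.0, 0.5), (2.0, 0.1), (0.5, 0.4)]
best = identify_best_arm(arms, rho=0.5, horizon=3000)
```

In this toy instance, arm 1 starts with the highest expected loss (2.1 after one pull) yet has the lowest expected loss at the horizon, which is why the sketch ranks arms by their extrapolated end-of-game loss instead of by the losses observed so far.
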
Type: | Proceedings paper |
---|---|
Title: | Best Model Identification: A Rested Bandit Formulation |
Event: | 38th International Conference on Machine Learning |
Dates: | 18th-24th July 2021 |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://proceedings.mlr.press/v139/cella21a.html |
Language: | English |
Additional information: | This version is the version of record. For information on re-use, please refer to the publisher's terms and conditions. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10164243 |