Best Model Identification: A Rested Bandit Formulation

Cella, Leonardo; Pontil, Massimiliano; Gentile, Claudio; (2021) Best Model Identification: A Rested Bandit Formulation. In: Meila, M and Zhang, T, (eds.) Proceedings of the 38th International Conference on Machine Learning. (pp. 1-11). PMLR. Green open access.

Abstract

We introduce and analyze a best arm identification problem in the rested bandit setting, wherein arms are themselves learning algorithms whose expected losses decrease with the number of times the arm has been played. The shape of the expected loss functions is similar across arms, and is assumed to be available up to unknown parameters that have to be learned on the fly. We define a novel notion of regret for this problem, where we compare to the policy that always plays the arm having the smallest expected loss at the end of the game. We analyze an arm elimination algorithm whose regret vanishes as the time horizon increases. The actual rate of convergence depends in a detailed way on the postulated functional form of the expected losses. We complement our analysis with lower bounds, indicating strengths and limitations of the proposed solution.
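
As a rough illustration of the setting described in the abstract, the following Python sketch simulates a simple arm elimination loop over arms whose expected loss shrinks with their own pull count. Everything specific here is an assumption made for illustration and is not taken from the paper: the loss shape alpha / n**rho + beta with known rho, the least-squares fit, the heuristic confidence width, the function names, and the way regret is computed (expected loss of the returned arm at its final pull count, minus the smallest expected loss any arm would reach if played for the whole horizon).

```python
import math
import random


def expected_loss(alpha, beta, rho, n):
    # Hypothetical loss shape (an assumption): decreasing in the pull count n.
    return alpha / n ** rho + beta


def predict_final_loss(losses, rho, horizon):
    # Least-squares fit of loss ~ alpha * n**(-rho) + beta,
    # then extrapolation of the fitted curve to n = horizon.
    xs = [(k + 1) ** -rho for k in range(len(losses))]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(losses) / len(losses)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, losses))
    alpha_hat = sxy / sxx
    beta_hat = y_bar - alpha_hat * x_bar
    return alpha_hat * horizon ** -rho + beta_hat


def run_elimination(arms, horizon, rho=0.5, noise=0.05, min_pulls=10, seed=0):
    # arms: list of (alpha, beta) pairs, unknown to the learner.
    # Assumes horizon is much larger than len(arms) * min_pulls.
    rng = random.Random(seed)
    history = [[] for _ in arms]
    active = list(range(len(arms)))
    budget = horizon
    while budget > 0 and len(active) > 1:
        # Rested setting: pulling arm i advances only arm i's own learning curve.
        for i in active:
            if budget == 0:
                break
            alpha, beta = arms[i]
            n = len(history[i]) + 1
            observed = expected_loss(alpha, beta, rho, n) + rng.uniform(-noise, noise)
            history[i].append(observed)
            budget -= 1
        if min(len(history[i]) for i in active) < min_pulls:
            continue
        preds = {i: predict_final_loss(history[i], rho, horizon) for i in active}
        # Heuristic width (an assumption), shrinking as an arm is pulled more often.
        widths = {
            i: noise * math.sqrt(2 * math.log(horizon) / len(history[i]))
            for i in active
        }
        best_upper = min(preds[i] + widths[i] for i in active)
        # Drop arms whose optimistic end-of-game loss is still worse than
        # the most pessimistic estimate of the leading arm.
        active = [i for i in active if preds[i] - widths[i] <= best_upper]
    best = min(active, key=lambda i: predict_final_loss(history[i], rho, horizon))
    pulls = len(history[best]) + budget  # leftover budget goes to the survivor
    benchmark = min(expected_loss(a, b, rho, horizon) for a, b in arms)
    regret = expected_loss(*arms[best], rho, pulls) - benchmark
    return best, pulls, regret


if __name__ == "__main__":
    # Arm 1 starts out better than arm 0, but arm 0 is better at the end of the game.
    arms = [(1.0, 0.1), (0.5, 0.4), (2.0, 0.6)]
    print(run_elimination(arms, horizon=3000))
```

In this toy instance the arm that looks best after a single pull is not the arm that is best at the end of the game, which is why the sketch ranks arms by their extrapolated loss at the horizon rather than by their current loss. Pulls spent on arms that are later eliminated are what keep the illustrative regret above zero.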

Type: Proceedings paper
Title: Best Model Identification: A Rested Bandit Formulation
Event: 38th International Conference on Machine Learning
Dates: 18th-24th July 2021
Open access status: An open access version is available from UCL Discovery
Publisher version: https://proceedings.mlr.press/v139/cella21a.html
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher's terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10164243