Herbster, M;
              
      
            
                Pasteris, S;
              
      
            
                Vitale, F;
              
      
            
                Pontil, M;
              
      
        
        
  
(2021)
  A Gang of Adversarial Bandits.
    
    
      In: Ranzato, M and Beygelzimer, A and Dauphin, Y and Liang, PS and Wortman Vaughan, J, (eds.)
      Advances in Neural Information Processing Systems 34.
      
      (pp. pp. 2265-2279).
    
 NeurIPS
  
  
       
    
  
| Preview | Text a_gang_of_adversarial_bandits.pdf - Published Version Download (407kB) | Preview | 
Abstract
We consider running multiple instances of multi-armed bandit (MAB) problems in parallel. A main motivation for this study are online recommendation systems, in which each of N users is associated with a MAB problem and the goal is to exploit users' similarity in order to learn users' preferences to K items more efficiently. We consider the adversarial MAB setting, whereby an adversary is free to choose which user and which loss to present to the learner during the learning process. Users are in a social network and the learner is aided by a-priori knowledge of the strengths of the social links between all pairs of users. It is assumed that if the social link between two users is strong then they tend to share the same action. The regret is measured relative to an arbitrary function which maps users to actions. The smoothness of the function is captured by a resistance-based dispersion measure Ψ. We present two learning algorithms, GABA-I and GABA-II which exploit the network structure to bias towards functions of low Ψ values. We show that GABA-I has an expected regret bound of O(pln(N K/Ψ)ΨKT) and per-trial time complexity of O(K ln(N)), whilst GABA-II has a weaker O(pln(N/Ψ) ln(N K/Ψ)ΨKT) regret, but a better O(ln(K) ln(N)) per-trial time complexity. We highlight improvements of both algorithms over running independent standard MABs across users.
| Type: | Proceedings paper | 
|---|---|
| Title: | A Gang of Adversarial Bandits | 
| Event: | 35th Conference on Neural Information Processing Systems (NeurIPS 2021) | 
| ISBN-13: | 9781713845393 | 
| Open access status: | An open access version is available from UCL Discovery | 
| Publisher version: | https://proceedings.neurips.cc/paper/2021/hash/124... | 
| Language: | English | 
| Additional information: | This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions. | 
| UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science UCL > Provost and Vice Provost Offices > UCL BEAMS UCL | 
| URI: | https://discovery.ucl.ac.uk/id/eprint/10150637 | 
Archive Staff Only
|  | View Item | 
 
                      
