
Real-Time Bidding by Reinforcement Learning in Display Advertising

Cai, H; Ren, K; Zhang, W; Malialis, K; Wang, J; Yu, Y; Guo, D; (2017) Real-Time Bidding by Reinforcement Learning in Display Advertising. In: de Rijke, M and Shokouhi, M and Tomkins, A and Zhang, M, (eds.) WSDM '17: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. (pp. 661-670). ACM (Association for Computing Machinery): New York, USA. Green open access

Cai_Real-Time_Bidding_Reinforcement_AAM.pdf - Accepted Version


Abstract

The majority of online display ads are served through real-time bidding (RTB): each ad display impression is auctioned off in real-time when it is just being generated from a user visit. To place an ad automatically and optimally, it is critical for advertisers to devise a learning algorithm to cleverly bid an ad impression in real-time. Most previous works consider the bid decision as a static optimization problem of either treating the value of each impression independently or setting a bid price to each segment of ad volume. However, the bidding for a given ad campaign would repeatedly happen during its life span before the budget runs out. As such, each bid is strategically correlated by the constrained budget and the overall effectiveness of the campaign (e.g., the rewards from generated clicks), which is only observed after the campaign has completed. Thus, it is of great interest to devise an optimal bidding strategy sequentially so that the campaign budget can be dynamically allocated across all the available impressions on the basis of both the immediate and future rewards. In this paper, we formulate the bid decision process as a reinforcement learning problem, where the state space is represented by the auction information and the campaign’s real-time parameters, while an action is the bid price to set. By modeling the state transition via auction competition, we build a Markov Decision Process framework for learning the optimal bidding policy to optimize the advertising performance in the dynamic real-time bidding environment. Furthermore, the scalability problem from the large real-world auction volume and campaign budget is well handled by state value approximation using neural networks. The empirical study on two large-scale real-world datasets and the live A/B testing on a commercial platform have demonstrated the superior performance and high efficiency compared to state-of-the-art methods.
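
The budget-constrained sequential view described in the abstract can be illustrated with a toy tabular example. The sketch below is not the authors' implementation; it is a minimal illustration under several assumptions that the abstract does not state: integer prices, a known and stationary market-price distribution, a second-price auction, and the same expected value for every impression. The names `solve_bidding_mdp`, `market_pmf`, and `click_value` are hypothetical. The state is reduced to (auctions remaining, budget remaining) and the action is the bid price.

```python
def solve_bidding_mdp(num_auctions, budget, market_pmf, click_value):
    """Toy dynamic program over (auctions left t, budget left b).

    Assumptions (illustrative only, not taken from the paper):
      * market_pmf[d] is the probability that the highest competing bid is d;
      * second-price auction: a bid a wins when d <= a and then pays d;
      * every won impression is worth click_value in expectation.
    Returns (value, policy), both indexed as [t][b].
    """
    max_price = len(market_pmf) - 1
    value = [[0.0] * (budget + 1) for _ in range(num_auctions + 1)]
    policy = [[0] * (budget + 1) for _ in range(num_auctions + 1)]

    for t in range(1, num_auctions + 1):
        for b in range(budget + 1):
            best_value, best_bid = float("-inf"), 0
            win_part = 0.0  # expected reward-to-go over winning outcomes
            win_prob = 0.0  # probability of winning with the current bid
            # A bid can never exceed the remaining budget.
            for bid in range(min(b, max_price) + 1):
                win_part += market_pmf[bid] * (click_value + value[t - 1][b - bid])
                win_prob += market_pmf[bid]
                # Losing keeps the budget intact for the remaining auctions.
                q = win_part + (1.0 - win_prob) * value[t - 1][b]
                if q > best_value:
                    best_value, best_bid = q, bid
            value[t][b] = best_value
            policy[t][b] = best_bid
    return value, policy


# Competing price is 1 or 2 with equal probability; each win is worth 1.
value, policy = solve_bidding_mdp(
    num_auctions=2, budget=2, market_pmf=[0.0, 0.5, 0.5], click_value=1.0
)
print(policy[1][2], policy[2][2])
```

In this toy setting the selected bid with one auction left and a budget of 2 is 2, while with two auctions left it is 1: holding back budget for a future impression is worth more than winning the current one for certain. A table of this kind grows with the product of auction volume and budget, which is the scalability issue the abstract says is handled by approximating the state value with neural networks.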

Type: Proceedings paper
Title: Real-Time Bidding by Reinforcement Learning in Display Advertising
Event: WSDM '17: Tenth ACM International Conference on Web Search and Data Mining, 6-10 February 2017, Cambridge, UK
Location: Cambridge, UK
Dates: 06 February 2017 - 10 February 2017
ISBN-13: 9781450346757
Open access status: An open access version is available from UCL Discovery
DOI: 10.1145/3018661.3018702
Publisher version: https://doi.org/10.1145/3018661.3018702
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Bid Optimization, Reinforcement Learning, Display Ads
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/1524039