Roa Vicens, Jacobo;
(2024)
Bayesian and Adversarial Inverse Reinforcement Learning for Limit Order Book Simulators.
Doctoral thesis (Ph.D), UCL (University College London).
Text: | Roa Vicens_Thesis.pdf (6MB) |
Abstract
Inverse reinforcement learning (IRL) is a field of machine learning aimed at modelling the behaviour of expert reinforcement learning agents by recovering their underlying reward and policy functions from observations of the actions they execute in response to the evolution of their target environments. This capability makes IRL particularly valuable in areas such as financial markets, where the aggregate behaviour of competing agents drives the evolution of asset prices. It is especially relevant for limit order books (LOBs), the electronic venues where demand and supply of publicly traded securities match and clear. While there is substantial literature on limit order book simulation, and various IRL methods have shown promising results in tasks such as robotics, automatic control and video games thanks to the use of neural networks to learn reward functions, few works have explored the application of such IRL methods to limit order books; in particular, their compatibility with the complexities of LOB dynamics and data, and whether the rewards and policies recovered through IRL achieve performance comparable to that of the original agents in the order book. The work presented in this thesis explores various deep learning approaches to IRL applied to agents operating on a limit order book, as well as machine learning models designed specifically to learn challenging features of order book data. Namely, we first introduce a method to solve IRL based on Bayesian neural networks; secondly, a combination of adversarial inverse reinforcement learning with a limit order book simulator trained on real market data; and finally, neural models incorporating tensorial and graph components to directly address the specific challenges of learning high-dimensional LOB time series.
These results open doors to further paths of research to learn and model the behaviour of expert agents in complex financial markets.
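As a brief illustration of the adversarial IRL component mentioned in the abstract, the sketch below implements the AIRL-style discriminator form D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + π(a|s)), using scalar stand-ins for the learned reward estimate f and the policy probability π. The function name and inputs are illustrative, not taken from the thesis.

```python
import math

def airl_discriminator(reward_estimate: float, log_policy_prob: float) -> float:
    """AIRL-style discriminator D = exp(f) / (exp(f) + pi(a|s)).

    reward_estimate: f(s, a), the learned reward/advantage estimate.
    log_policy_prob: log pi(a|s), the generator policy's log-probability
                     of the observed action.
    Returns the probability the discriminator assigns to the
    (state, action) pair being an expert demonstration.
    """
    policy_prob = math.exp(log_policy_prob)
    return math.exp(reward_estimate) / (math.exp(reward_estimate) + policy_prob)

# When the reward estimate equals the policy's log-probability, the
# discriminator is maximally uncertain and outputs 0.5; larger reward
# estimates push the output towards 1 (classified as expert behaviour).
```

Under this formulation, training the discriminator against the generator policy recovers f as a reward signal, which is the mechanism the adversarial IRL approach relies on.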
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Bayesian and Adversarial Inverse Reinforcement Learning for Limit Order Book Simulators |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Copyright © The Author 2024. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10193288 |