MERINA+: Improving Generalization for Neural Video Adaptation via Information-Theoretic Meta-Reinforcement Learning

Kan, N; Li, C; Jiang, Y; Dai, W; Zou, J; Xiong, H; Toni, L; (2025) MERINA+: Improving Generalization for Neural Video Adaptation via Information-Theoretic Meta-Reinforcement Learning. IEEE Transactions on Circuits and Systems for Video Technology pp. 1-17. 10.1109/TCSVT.2025.3596636. (In press). Green open access

Abstract

Adaptive bitrate (ABR) streaming is a popular technique for improving the quality of experience (QoE) of users who watch videos online; for example, it can provide smoother playback by dynamically adjusting the requested video quality, and its associated bitrate, to constrained yet diverse network conditions. Recently, learning-based ABR algorithms have achieved notable performance gains with lower inference overhead than conventional heuristic or model-based baselines. However, their performance may degrade significantly in an unseen network environment with time-varying and heterogeneous throughput dynamics. To improve generalization, we propose a meta-reinforcement learning (meta-RL)-based neural ABR algorithm that can quickly adapt its policy to these unseen throughput dynamics. Specifically, we propose a model-free system framework comprising an inference network and a policy network. The inference network infers the distribution of a latent representation of the underlying dynamics from the recent throughput context, while the policy network is trained to adapt quickly to changing throughput dynamics using the sampled latent representation. To learn the inference network and meta-policy effectively on the mixed dynamics of practical ABR scenarios, we further design a loss function, based on variational information bottleneck theory, for training the inference and policy networks; its objective is to strike a trade-off between brevity of the latent representation and expressiveness of the meta-policy. We also derive a theoretically necessary condition for the bitrate versions that yield higher long-term QoE, based on which a dynamic action pruning strategy is further developed for practical implementation. This pruning strategy can not only prevent unsafe policy outputs in the midst of unseen throughput dynamics, but may also reduce the computational complexity of model-based ABR algorithms. Finally, the meta-training and meta-adaptation procedures of our proposed algorithm are implemented across a range of throughput dynamics. Empirical evaluations on various datasets containing real-world network traces verify that our algorithm surpasses state-of-the-art ABR algorithms, particularly in terms of average chunk QoE and fast adaptation across out-of-distribution throughput traces.
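
The trade-off described in the abstract, between brevity of the latent representation and expressiveness of the meta-policy, can be illustrated with a generic variational information bottleneck objective. What follows is a minimal Python sketch under stated assumptions, not the loss function from the paper, which the abstract does not spell out. It assumes a diagonal Gaussian posterior produced by the inference network, a standard normal prior, and a scalar trade-off weight; the function names and the parameters `policy_loss` and `beta` are hypothetical.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.

    Assumption: diagonal Gaussian posterior and standard normal prior.
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def sample_latent(mu, log_var, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def vib_loss(policy_loss, mu, log_var, beta):
    """Hypothetical combined objective: policy term plus beta-weighted KL term.

    A larger beta favors a more compressed (briefer) latent representation;
    a smaller beta leaves more capacity for the policy term.
    """
    return policy_loss + beta * kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative posterior parameters for a 2-dimensional latent.
    mu, log_var = [0.5, -0.3], [-1.0, -0.5]
    z = sample_latent(mu, log_var, rng)
    print("sampled latent:", z)
    print("loss:", vib_loss(policy_loss=1.2, mu=mu, log_var=log_var, beta=0.1))
```

In this sketch the sampled `z` would be fed to the policy network alongside the usual ABR observations, and `beta` controls how strongly the latent representation is pushed toward the prior.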

Type: Article
Title: MERINA+: Improving Generalization for Neural Video Adaptation via Information-Theoretic Meta-Reinforcement Learning
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/TCSVT.2025.3596636
Publisher version: https://doi.org/10.1109/tcsvt.2025.3596636
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng
URI: https://discovery.ucl.ac.uk/id/eprint/10212381