Toward Generalized Psychovisual Preprocessing For Video Encoding

Chadha, A; Anam, MA; Treder, M; Fadeev, I; Andreopoulos, Y; (2022) Toward Generalized Psychovisual Preprocessing For Video Encoding. SMPTE Motion Imaging Journal, 131 (4) pp. 39-44. 10.5594/JMI.2022.3160801. Green open access

Abstract

Deep perceptual preprocessing has recently emerged as a new way to enable further bitrate savings across several generations of video encoders without breaking standards or requiring any changes in client devices. In this article, we lay the foundation for a generalized psychovisual preprocessing framework for video encoding and describe one of its promising instantiations that is practically deployable for video-on-demand, live, gaming, and user-generated content (UGC). Results using state-of-the-art advanced video coding (AVC), high efficiency video coding (HEVC), and versatile video coding (VVC) encoders show that average bitrate [Bjontegaard delta-rate (BD-rate)] gains of 11%-17% are obtained over three state-of-the-art reference-based quality metrics [Netflix video multi-method assessment fusion (VMAF), structural similarity index (SSIM), and Apple advanced video quality tool (AVQT)], as well as the recently proposed nonreference International Telecommunication Union-Telecommunication (ITU-T) P.1204 metric. On CPU, the proposed framework is shown to run twice as fast as x264 medium-preset encoding. On GPU hardware, our approach achieves 714 frames/sec for 1080p video (below 2 ms/frame), thereby enabling its use in very-low-latency live video or game streaming applications.
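For readers unfamiliar with the BD-rate figure quoted in the abstract, the following is a minimal sketch of how such a number is commonly computed, assuming the conventional four rate-quality points per encoder and a cubic fit of log-bitrate against quality. The abstract does not state the exact procedure the authors used; the function names and the sample numbers are hypothetical and purely illustrative.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate the polynomial passing through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _mean_log_rate(quality, rates, lo, hi):
    """Average of the fitted log-bitrate curve over the quality range [lo, hi]."""
    log_rates = [math.log(r) for r in rates]

    def f(q):
        return _lagrange(quality, log_rates, q)

    # With four points the fitted curve is a cubic, and Simpson's rule
    # integrates a cubic exactly.
    integral = (hi - lo) / 6.0 * (f(lo) + 4.0 * f((lo + hi) / 2.0) + f(hi))
    return integral / (hi - lo)


def bd_rate(ref_quality, ref_rates, test_quality, test_rates):
    """Return the average bitrate change (%) of the test curve relative to
    the reference curve at equal quality. Negative values mean savings."""
    if len(ref_quality) != 4 or len(test_quality) != 4:
        raise ValueError("this sketch assumes exactly four points per curve")
    # Compare the curves only where their quality ranges overlap.
    lo = max(min(ref_quality), min(test_quality))
    hi = min(max(ref_quality), max(test_quality))
    if hi <= lo:
        raise ValueError("quality ranges do not overlap")
    diff = (_mean_log_rate(test_quality, test_rates, lo, hi)
            - _mean_log_rate(ref_quality, ref_rates, lo, hi))
    return (math.exp(diff) - 1.0) * 100.0


# Illustrative data only (not from the article): quality scores and
# bitrates in kbps for a baseline encode and a preprocessed encode.
quality = [70.0, 80.0, 88.0, 94.0]
baseline_kbps = [1000.0, 2000.0, 4000.0, 8000.0]
preprocessed_kbps = [0.85 * r for r in baseline_kbps]
result = bd_rate(quality, baseline_kbps, quality, preprocessed_kbps)
```

In this toy input the preprocessed encode needs 15% fewer bits at every quality level, so `result` is -15 up to floating-point error. A saving of that size would be described as a 15% BD-rate gain in the sense used above.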

Type: Article
Title: Toward Generalized Psychovisual Preprocessing For Video Encoding
Open access status: An open access version is available from UCL Discovery
DOI: 10.5594/JMI.2022.3160801
Publisher version: https://doi.org/10.5594/JMI.2022.3160801
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Deep neural networks, perceptual optimization, video delivery
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL
URI: https://discovery.ucl.ac.uk/id/eprint/10152967