TY  - JOUR
PB  - Society of Motion Picture and Television Engineers (SMPTE)
VL  - 131
SN  - 1545-0279
IS  - 4
JF  - SMPTE Motion Imaging Journal
EP  - 44
N1  - This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
KW  - Deep neural networks
KW  - perceptual optimization
KW  - video delivery
AV  - public
ID  - discovery10152967
TI  - Toward Generalized Psychovisual Preprocessing for Video Encoding
UR  - https://doi.org/10.5594/JMI.2022.3160801
A1  - Chadha, A
A1  - Anam, MA
A1  - Treder, M
A1  - Fadeev, I
A1  - Andreopoulos, Y
Y1  - 2022/05/10/
N2  - Deep perceptual preprocessing has recently emerged as a new way to enable further bitrate savings across several generations of video encoders without breaking standards or requiring any changes in client devices. In this article, we lay the foundation for a generalized psychovisual preprocessing framework for video encoding and describe one of its promising instantiations that is practically deployable for video-on-demand, live, gaming, and user-generated content (UGC). Results using state-of-the-art advanced video coding (AVC), high efficiency video coding (HEVC), and versatile video coding (VVC) encoders show that average bitrate [Bjontegaard delta-rate (BD-rate)] gains of 11%-17% are obtained over three state-of-the-art reference-based quality metrics [Netflix video multi-method assessment fusion (VMAF), structural similarity index (SSIM), and Apple advanced video quality tool (AVQT)], as well as the recently proposed nonreference International Telecommunication Union-Telecommunication (ITU-T) P.1204 metric. On CPU, the proposed framework is shown to be twice as fast as x264 medium-preset encoding. On GPU hardware, our approach achieves 714 frames/sec for 1080p video (below 2 ms/frame), thereby enabling its use in very-low-latency live video or game streaming applications.
SP  - 39
ER  -