Rate-Accuracy Trade-Off In Video Classification With Deep Convolutional Neural Networks

Abbas, A; Jubran, M; Chadha, A; Andreopoulos, Y (2018) Rate-Accuracy Trade-Off In Video Classification With Deep Convolutional Neural Networks. IEEE Transactions on Circuits and Systems for Video Technology, pp. 793-797. DOI: 10.1109/TCSVT.2018.2887408. (In press). Green open access

Abstract

Advanced video classification systems decode video frames to derive the texture and motion representations needed for ingestion and analysis by spatio-temporal deep convolutional neural networks (CNNs). However, in visual Internet-of-Things applications, surveillance systems and semantic crawlers of large video repositories, the video capture and the CNN-based semantic analysis tend not to be colocated. This necessitates the transport of compressed video over networks and incurs significant overhead in bandwidth and energy consumption, which undermines the deployment potential of such systems. In this paper, we investigate the trade-off between the encoding bitrate and the achievable accuracy of CNN-based video classification models that directly ingest AVC/H.264 and HEVC encoded videos. Instead of retaining entire compressed video bitstreams and applying complex optical flow calculations prior to CNN processing, we retain only motion vector and select texture information at significantly reduced bitrates and apply no additional processing prior to CNN ingestion. Based on three CNN architectures and two action recognition datasets, we achieve bitrate savings of 11%–94% with marginal effect on classification accuracy. A model-based selection between multiple CNNs increases these savings further, to the point where, if up to 7% loss of accuracy can be tolerated, video classification can take place with as little as 3 kbps for the transport of the required compressed video information to the system implementing the CNN models.
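The selection between multiple CNNs described in the abstract can be pictured as choosing an operating point on a rate-accuracy curve. The following is a minimal sketch of that idea, not the authors' method: it assumes each candidate is summarised by a (bitrate, accuracy) pair and picks the lowest-bitrate candidate whose accuracy loss stays within a tolerance. The names `OperatingPoint`, `select_operating_point` and `bitrate_saving`, the model labels, and all numeric values are hypothetical placeholders and are not results from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OperatingPoint:
    model: str           # hypothetical CNN label
    bitrate_kbps: float  # bitrate of the compressed input sent to that CNN
    accuracy: float      # classification accuracy in [0, 1]


def select_operating_point(
    points: Sequence[OperatingPoint],
    reference_accuracy: float,
    max_loss: float,
) -> Optional[OperatingPoint]:
    """Return the lowest-bitrate point whose accuracy loss is within max_loss."""
    feasible = [p for p in points if reference_accuracy - p.accuracy <= max_loss]
    if not feasible:
        return None
    # Ties on bitrate are broken in favour of the higher accuracy.
    return min(feasible, key=lambda p: (p.bitrate_kbps, -p.accuracy))


def bitrate_saving(reference_kbps: float, chosen_kbps: float) -> float:
    """Fractional bitrate saving relative to a reference bitrate."""
    return 1.0 - chosen_kbps / reference_kbps


# Placeholder values for illustration only; NOT measurements from the paper.
POINTS = [
    OperatingPoint("cnn_a", 400.0, 0.90),
    OperatingPoint("cnn_a", 100.0, 0.88),
    OperatingPoint("cnn_b", 20.0, 0.85),
    OperatingPoint("cnn_c", 5.0, 0.70),
]

best = select_operating_point(POINTS, reference_accuracy=0.90, max_loss=0.07)
if best is not None:
    saving = bitrate_saving(400.0, best.bitrate_kbps)
    print(f"{best.model} at {best.bitrate_kbps} kbps, saving {saving:.0%}")
```

In this sketch the tolerance acts as a hard constraint and bitrate is the quantity being minimised; a deployment that instead has a fixed bandwidth budget would swap the two roles and maximise accuracy subject to a bitrate cap.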

Type:	Article
Title:	Rate-Accuracy Trade-Off In Video Classification With Deep Convolutional Neural Networks
Location:	Athens, GREECE
Open access status:	An open access version is available from UCL Discovery
DOI:	10.1109/TCSVT.2018.2887408
Publisher version:	https://doi.org/10.1109/TCSVT.2018.2887408
Language:	English
Additional information:	This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords:	Streaming media, Bit rate, Optical flow, Encoding, Standards, Visualization, Computer architecture, Video classification, convolutional neural networks, video streaming
UCL classification:	UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng
URI:	https://discovery.ucl.ac.uk/id/eprint/10067518
