Compressed-domain video classification with deep neural networks: “There's way too much information to decode the matrix”


Chadha, A; Abbas, A; Andreopoulos, Y; (2017) Compressed-domain video classification with deep neural networks: “There's way too much information to decode the matrix”. In: Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP). (pp. 1832-1836). IEEE: Beijing, China. Green open access

Text: mv_cnn_v12_preprint.pdf - Accepted Version (259kB)

Abstract

We investigate video classification via a 3D deep convolutional neural network (CNN) that directly ingests compressed bitstream information. This idea is based on the observation that video macroblock (MB) motion vectors, which are very compact and directly available from the compressed bitstream, inherently capture local spatio-temporal changes in each video scene. Our results on two standard video datasets show that our approach outperforms pixel-based approaches and remains within 7 percentage points of the best classification results reported by highly complex optical-flow and deep-CNN methods. At the same time, a CPU-based realization of our approach is found to be more than 2500 times faster at motion extraction than GPU-based optical flow methods, and also offers a 2 to 3.4-fold reduction in the utilized deep CNN weights compared to recent architectures. This indicates that deep learning based on compressed video bitstream information may allow for advanced video classification to be deployed in very large datasets using commodity CPU hardware. Source code is available at http://www.github.com/mvcnn.
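
As a rough illustration of the idea in the abstract, the sketch below stacks per-frame motion-vector fields into a spatio-temporal volume and passes one 3D filter over it. It is not the authors' architecture or code: the 16x16 macroblock size, the 320x240 frame size, the (rows, cols, 2) field layout, and the helper names `stack_motion_vectors` and `conv3d_valid` are all assumptions made for this example, and the motion vectors are random placeholders rather than values parsed from a real bitstream.

```python
import numpy as np

def stack_motion_vectors(mv_fields):
    """Stack per-frame motion-vector fields into one (T, H, W, 2) volume.

    Each field is assumed to be an (H, W, 2) array holding the (dx, dy)
    motion vector of every macroblock in one frame (hypothetical layout).
    """
    return np.stack([np.asarray(f, dtype=np.float32) for f in mv_fields], axis=0)

def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation with a single filter.

    volume: (T, H, W, C) input; kernel: (kt, kh, kw, C) filter.
    Returns a (T-kt+1, H-kh+1, W-kw+1) response map.
    """
    T, H, W, C = volume.shape
    kt, kh, kw, kc = kernel.shape
    assert kc == C, "kernel and input must have the same channel count"
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1), dtype=np.float32)
    for t in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                patch = volume[t:t + kt, y:y + kh, x:x + kw, :]
                out[t, y, x] = np.sum(patch * kernel)
    return out

# Toy input: 8 frames of an assumed 320x240 video with assumed 16x16 macroblocks.
frames, mb_rows, mb_cols = 8, 240 // 16, 320 // 16
rng = np.random.default_rng(0)
fields = [rng.integers(-4, 5, size=(mb_rows, mb_cols, 2)) for _ in range(frames)]

volume = stack_motion_vectors(fields)
kernel = np.ones((3, 3, 3, 2), dtype=np.float32) / 54.0  # plain averaging filter
response = conv3d_valid(volume, kernel)

# Values per frame: motion-vector grid versus decoded RGB pixels.
mv_values = mb_rows * mb_cols * 2
pixel_values = 240 * 320 * 3
```

Under these assumed sizes, each frame contributes 600 motion-vector values against 230400 RGB pixel values, which gives a sense of why the abstract describes motion vectors as very compact. A real network would use many learned filters and a trained classifier on top, which this sketch omits.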

Type:	Proceedings paper
Title:	Compressed-domain video classification with deep neural networks: “There's way too much information to decode the matrix”
Event:	24th IEEE International Conference on Image Processing (ICIP)
Location:	Beijing, People's Republic of China
Dates:	17 September 2017 - 20 September 2017
Open access status:	An open access version is available from UCL Discovery
DOI:	10.1109/ICIP.2017.8296598
Publisher version:	https://doi.org/10.1109/ICIP.2017.8296598
Language:	English
Additional information:	This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords:	video coding, classification, deep learning
UCL classification:	UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng
URI:	https://discovery.ucl.ac.uk/id/eprint/10056313
