UCL Discovery

Leveraging weak supervision for video understanding

Garcia Cifuentes, C; (2013) Leveraging weak supervision for video understanding. Doctoral thesis, UCL (University College London).

Full text not available from this repository.


This research deals with the challenging task of video classification, with a particular focus on action recognition, which is essential for a comprehensive understanding of videos. In the typical scenario, there is a list of semantic categories to be modeled, and example clips are given together with their associated category label, indicating which action of interest happens in that clip. No information is given about where or when the action happens, and even less about why the annotator considered the clip to belong to a sometimes ambiguous category. Within the framework of the bag-of-words representation of videos, we explore how to leverage such weak labels from three points of view: (i) the use of coherent supervision from the earliest stages of the pipeline; (ii) the combination of features that are heterogeneous in nature and scale; and (iii) mid-level representations of videos based on regions, so as to increase the ability to discriminate relevant locations in the video. For the quantization of local features, we propose and evaluate a novel form of supervision to train random forests, which explicitly aims at the discriminative power of the resulting bags of words. We show that our forests are better than traditional ones at incorporating contextual elements during quantization, and draw attention to the risk of naive combination of features. We also show that mid-level representations carry complementary information that can improve classification. Moreover, we propose a novel application of video classification to tracking. We show that weak clip labels can be used to successfully classify videos into categories of dynamic models. In this way, we improve tracking by performing classification-based dynamic model selection.

Type: Thesis (Doctoral)
Title: Leveraging weak supervision for video understanding
Language: English
UCL classification: UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/1392377
