UCL Discovery

Context Helps: Integrating context information with videos in a graph-based HAR framework

Xu, Binxia; Bikakis, Antonios; Onah, Daniel; Vlachidis, Andreas; Dickens, Luke; (2024) Context Helps: Integrating context information with videos in a graph-based HAR framework. In: Neural-Symbolic Learning and Reasoning. NeSy 2024. (pp. 3-28). Springer, Cham

Text: NeSy_camera_ready.pdf - Published Version (27MB)
Access restricted to UCL open access staff until 11 September 2025.

Abstract

Human Activity Recognition (HAR) from videos is a challenging, data-intensive task. There have been significant strides in recent years, but even state-of-the-art (SoTA) models rely heavily on domain-specific supervised fine-tuning of visual features, and even with this data- and compute-intensive fine-tuning, overall performance can still be limited. We argue that the next generation of HAR models could benefit from explicit neuro-symbolic mechanisms in order to flexibly exploit rich contextual information available in, and for, videos. With this in mind, we propose a Human Activity Recognition with Context Prompt (HARCP) task to investigate the value of contextual information for video-based HAR. We also present a neuro-symbolic, graph neural network-based framework that integrates zero-shot object localisation to address the HARCP task. This captures the human activity as a sequence of graph-based scene representations relating parts of the human body to key objects, supporting the targeted injection of external contextual knowledge in symbolic form. We evaluate existing HAR baselines alongside our graph-based methods to demonstrate the advantage of being able to accommodate this additional channel of information. Our evaluations show that not only does context information from key objects boost accuracy beyond that provided by SoTA HAR models alone, but there is also a greater semantic similarity between our model's errors and the target class. We argue that this represents improved model alignment with human-like errors, and we quantify it with a novel measure we call Semantic Prediction Dispersion.
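To make the idea of "a sequence of graph-based scene representations relating parts of the human body to key objects" concrete, the following is a minimal, illustrative sketch in plain Python. It is not the authors' implementation: the node labels, the `context` field used to inject symbolic information, and the `build_sequence` helper are all hypothetical, and the GNN classifier that would consume such a sequence is not shown.

```python
# Illustrative sketch only: a per-frame scene graph relating body-part nodes to
# key-object nodes, with a slot for injected symbolic context. All names here
# (SceneNode, SceneGraph, build_sequence, the "location" context key) are
# hypothetical and do not come from the paper.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SceneNode:
    node_id: str            # e.g. "left_hand", "cup" (hypothetical labels)
    kind: str               # "body_part" or "key_object"
    features: List[float]   # visual embedding, assumed to come from a zero-shot localiser


@dataclass
class SceneGraph:
    nodes: List[SceneNode]
    edges: List[Tuple[str, str]]                    # spatial / interaction relations
    context: Dict[str, str] = field(default_factory=dict)  # injected symbolic context


def build_sequence(frames: List[SceneGraph]) -> List[SceneGraph]:
    """A video becomes an ordered sequence of scene graphs; a graph neural
    network classifier (not shown) would consume this sequence to predict
    the activity class."""
    return frames


# Hypothetical usage: one frame in which a hand is near a cup, plus a context prompt.
frame = SceneGraph(
    nodes=[
        SceneNode("left_hand", "body_part", [0.1, 0.4]),
        SceneNode("cup", "key_object", [0.7, 0.2]),
    ],
    edges=[("left_hand", "cup")],
    context={"location": "kitchen"},  # example of external contextual knowledge
)
print(len(build_sequence([frame])))   # -> 1
```

The point of the sketch is only to show where symbolic context could attach to a graph-based video representation; the paper's actual graph construction, features, and context-injection mechanism should be taken from the published version.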

Type: Proceedings paper
Title: Context Helps: Integrating context information with videos in a graph-based HAR framework
Event: Neural-Symbolic Learning and Reasoning (NeSy 2024)
Location: Barcelona, Spain
Dates: 9 Sep 2024 - 12 Sep 2024
DOI: 10.1007/978-3-031-71167-1_1
Publisher version: https://doi.org/10.1007/978-3-031-71167-1_1
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: context, human activity recognition, neuro-symbolic integration, graph neural network, human alignment
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL SLASH
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and Humanities
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and Humanities > Dept of Information Studies
URI: https://discovery.ucl.ac.uk/id/eprint/10195016
