Xu, Binxia; Bikakis, Antonios; Onah, Daniel; Vlachidis, Andreas; Dickens, Luke (2024) Context Helps: Integrating context information with videos in a graph-based HAR framework. In: Neural-Symbolic Learning and Reasoning. NeSy 2024. (pp. 3-28). Springer, Cham.
NeSy_camera_ready.pdf - Published Version. Access restricted to UCL open access staff until 11 September 2025. Download (27MB)
Abstract
Human Activity Recognition (HAR) from videos is a challenging, data-intensive task. There have been significant strides in recent years, but even state-of-the-art (SoTA) models rely heavily on domain-specific supervised fine-tuning of visual features, and even with this data- and compute-intensive fine-tuning, overall performance can still be limited. We argue that the next generation of HAR models could benefit from explicit neuro-symbolic mechanisms in order to flexibly exploit rich contextual information available in, and for, videos. With a view to this, we propose a Human Activity Recognition with Context Prompt (HARCP) task to investigate the value of contextual information for video-based HAR. We also present a neuro-symbolic graph neural network-based framework that integrates zero-shot object localisation to address the HARCP task. This captures the human activity as a sequence of graph-based scene representations relating parts of the human body to key objects, supporting the targeted injection of external contextual knowledge in symbolic form. We evaluate existing HAR baselines alongside our graph-based methods to demonstrate the advantage of being able to accommodate this additional channel of information. Our evaluations show that not only does context information from key objects boost accuracy beyond that provided by SoTA HAR models alone, but there is also a greater semantic similarity between our model’s errors and the target class. We argue that this represents an improved model alignment with human-like errors and quantify this with a novel measure we call Semantic Prediction Dispersion.
| Type | Proceedings paper |
| --- | --- |
| Title | Context Helps: Integrating context information with videos in a graph-based HAR framework |
| Event | Neural-symbolic learning and reasoning (NeSy) |
| Location | Barcelona, Spain |
| Dates | 9 Sep 2024 - 12 Sep 2024 |
| DOI | 10.1007/978-3-031-71167-1_1 |
| Publisher version | https://doi.org/10.1007/978-3-031-71167-1_1 |
| Language | English |
| Additional information | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
| Keywords | context, human activity recognition, neuro-symbolic integration, graph neural network, human alignment |
| UCL classification | UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and Humanities > Dept of Information Studies |
| URI | https://discovery.ucl.ac.uk/id/eprint/10195016 |