UCL Discovery

Audio Event-Relational Graph Representation Learning for Acoustic Scene Classification

Hou, Y; Song, S; Yu, C; Wang, W; Botteldooren, D (2023) Audio Event-Relational Graph Representation Learning for Acoustic Scene Classification. IEEE Signal Processing Letters, 30, pp. 1382-1386. 10.1109/LSP.2023.3319233. Green open access

Audio_Event-Relational_Graph_Representation_Learning_for_Acoustic_Scene_Classification.pdf - Published Version

Abstract

Most deep learning-based acoustic scene classification (ASC) approaches identify scenes from acoustic features extracted from audio clips, in which information from polyphonic audio events (AEs) is mixed and entangled. However, these approaches have difficulty explaining which cues they use to identify scenes. This letter presents the first study on the relationship between real-life acoustic scenes and semantic embeddings from the most relevant AEs. Specifically, we propose an event-relational graph representation learning (ERGL) framework for ASC that classifies scenes and simultaneously indicates clearly and directly which cues are used in classification. In the event-relational graph, the embedding of each event is treated as a node, while relationship cues derived from each pair of nodes are described by multi-dimensional edge features. Experiments on a real-life ASC dataset show that the proposed ERGL achieves competitive performance on ASC by learning embeddings of only a limited number of AEs. The results show the feasibility of recognizing diverse acoustic scenes based on the audio event-relational graph.
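
The graph structure described in the abstract (one node per audio event embedding, one multi-dimensional feature vector per pair of nodes) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_event_graph`, the toy embeddings, and the choice of edge feature (element-wise product concatenated with absolute difference) are all assumptions made for illustration, since the abstract does not state how edge features are computed.

```python
from itertools import permutations


def build_event_graph(event_embeddings):
    """Build a toy event-relational graph (hypothetical helper).

    event_embeddings: dict mapping an event name to a list of floats.
    Returns (nodes, edges), where edges maps an ordered pair of event
    names to a multi-dimensional edge feature vector.
    """
    nodes = dict(event_embeddings)
    edges = {}
    for src, dst in permutations(nodes, 2):
        a, b = nodes[src], nodes[dst]
        # Assumed edge feature, not taken from the paper: element-wise
        # product and absolute difference of the two node embeddings,
        # concatenated into one vector.
        product = [x * y for x, y in zip(a, b)]
        abs_diff = [abs(x - y) for x, y in zip(a, b)]
        edges[(src, dst)] = product + abs_diff
    return nodes, edges


# Made-up embeddings for three audio events (illustrative values only).
embeddings = {
    "speech": [0.9, 0.1, 0.0],
    "car": [0.2, 0.8, 0.1],
    "bird": [0.0, 0.1, 0.7],
}
nodes, edges = build_event_graph(embeddings)
```

With three events the sketch yields three nodes and six directed edges, each carrying a six-dimensional feature vector. In a learned model the edge features would presumably be produced by trainable layers rather than a fixed formula.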

Type: Article
Title: Audio Event-Relational Graph Representation Learning for Acoustic Scene Classification
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/LSP.2023.3319233
Publisher version: https://doi.org/10.1109/LSP.2023.3319233
Language: English
Additional information: This is an Open Access article published under a Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/).
Keywords: Acoustic scene classification, event-relational graph, multi-dimensional edge, graph representation learning
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > UCL Interaction Centre
URI: https://discovery.ucl.ac.uk/id/eprint/10179898
Downloads since deposit: 21
