eprintid: 10205071
rev_number: 6
eprint_status: archive
userid: 699
dir: disk0/10/20/50/71
datestamp: 2025-02-20 08:09:49
lastmod: 2025-02-20 08:09:49
status_changed: 2025-02-20 08:09:49
type: proceedings_section
metadata_visibility: show
sword_depositor: 699
creators_name: Zhang, Z
creators_name: Thwaites, A
creators_name: Woolgar, A
creators_name: Moore, B
creators_name: Zhang, C
title: SWIM: Short-Window CNN Integrated With Mamba for EEG-Based Auditory Spatial Attention Decoding
ispublished: pub
divisions: UCL
divisions: B02
divisions: C07
divisions: D05
divisions: F73
keywords: Training, Accuracy, Source coding, Benchmark testing, Brain modeling, Feature extraction, Data augmentation, Electroencephalography, Data models, Decoding
note: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
abstract: In complex auditory environments, the human auditory system possesses the remarkable ability to focus on a specific speaker while disregarding others. In this study, a new model named SWIM, a short-window convolutional neural network (CNN) integrated with Mamba, is proposed for identifying the locus of auditory attention (left or right) from electroencephalography (EEG) signals without relying on speech envelopes. SWIM consists of two parts. The first is a short-window CNN (SWCNN), which acts as a short-term EEG feature extractor and achieves a final accuracy of 84.9% in the leave-one-speaker-out setup on the widely used KUL dataset. This improvement is due to the use of an improved CNN structure, data augmentation, multitask training, and model combination. The second part, Mamba, is a sequence model applied for the first time to auditory spatial attention decoding, leveraging long-term dependencies across previous SWCNN time steps. By jointly training SWCNN and Mamba, the proposed SWIM structure uses both short-term and long-term information and achieves an accuracy of 86.2%, which reduces the classification errors by a relative 31.0% compared to the previous state-of-the-art result.
date: 2025-01-16
date_type: published
publisher: IEEE
official_url: https://doi.org/10.1109/slt61566.2024.10832311
oa_status: green
full_text_type: other
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2362578
doi: 10.1109/SLT61566.2024.10832311
lyricists_name: Thwaites, Andrew
lyricists_id: ACGTH51
actors_name: Flynn, Bernadette
actors_id: BFFLY94
actors_role: owner
full_text_status: public
pres_type: paper
publication: Proceedings of 2024 IEEE Spoken Language Technology Workshop, SLT 2024
place_of_pub: Macao
pagerange: 1031-1038
event_title: 2024 IEEE Spoken Language Technology Workshop (SLT)
event_dates: 2 Dec 2024 - 5 Dec 2024
book_title: Proceedings of 2024 IEEE Spoken Language Technology Workshop, SLT 2024
citation: Zhang, Z; Thwaites, A; Woolgar, A; Moore, B; Zhang, C; (2025) SWIM: Short-Window CNN Integrated With Mamba for EEG-Based Auditory Spatial Attention Decoding. In: Proceedings of 2024 IEEE Spoken Language Technology Workshop, SLT 2024. (pp. 1031-1038). IEEE: Macao. Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/10205071/1/2409.19884v2.pdf