eprintid: 10205071
rev_number: 6
eprint_status: archive
userid: 699
dir: disk0/10/20/50/71
datestamp: 2025-02-20 08:09:49
lastmod: 2025-02-20 08:09:49
status_changed: 2025-02-20 08:09:49
type: proceedings_section
metadata_visibility: show
sword_depositor: 699
creators_name: Zhang, Z
creators_name: Thwaites, A
creators_name: Woolgar, A
creators_name: Moore, B
creators_name: Zhang, C
title: SWIM: Short-Window CNN Integrated With Mamba for EEG-Based Auditory Spatial Attention Decoding
ispublished: pub
divisions: UCL
divisions: B02
divisions: C07
divisions: D05
divisions: F73
keywords: Training, Accuracy, Source coding, Benchmark testing, Brain modeling, Feature extraction, Data augmentation, Electroencephalography, Data models, Decoding
note: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
abstract: In complex auditory environments, the human auditory system possesses the remarkable ability to focus on a specific speaker while disregarding others. In this study, a new model named SWIM, a short-window convolutional neural network (CNN) integrated with Mamba, is proposed for identifying the locus of auditory attention (left or right) from electroencephalography (EEG) signals without relying on speech envelopes. SWIM consists of two parts. The first is a short-window CNN (SWCNN), which acts as a short-term EEG feature extractor and achieves a final accuracy of 84.9% in the leave-one-speaker-out setup on the widely used KUL dataset. This improvement is due to the use of an improved CNN structure, data augmentation, multitask training, and model combination. The second part, Mamba, is a sequence model applied for the first time to auditory spatial attention decoding, leveraging the long-term dependency from previous SWCNN time steps. By jointly training SWCNN and Mamba, the proposed SWIM structure uses both short-term and long-term information and achieves an accuracy of 86.2%, which reduces the classification errors by a relative 31.0% compared to the previous state-of-the-art result.
date: 2025-01-16
date_type: published
publisher: IEEE
official_url: https://doi.org/10.1109/slt61566.2024.10832311
oa_status: green
full_text_type: other
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2362578
doi: 10.1109/SLT61566.2024.10832311
lyricists_name: Thwaites, Andrew
lyricists_id: ACGTH51
actors_name: Flynn, Bernadette
actors_id: BFFLY94
actors_role: owner
full_text_status: public
pres_type: paper
publication: Proceedings of 2024 IEEE Spoken Language Technology Workshop, SLT 2024
place_of_pub: Macao
pagerange: 1031-1038
event_title: 2024 IEEE Spoken Language Technology Workshop (SLT)
event_dates: 2 Dec 2024 - 5 Dec 2024
book_title: Proceedings of 2024 IEEE Spoken Language Technology Workshop, SLT 2024
citation: Zhang, Z; Thwaites, A; Woolgar, A; Moore, B; Zhang, C; (2025) SWIM: Short-Window CNN Integrated With Mamba for EEG-Based Auditory Spatial Attention Decoding. In: Proceedings of 2024 IEEE Spoken Language Technology Workshop, SLT 2024. (pp. 1031-1038). IEEE: Macao. Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/10205071/1/2409.19884v2.pdf