TY  - GEN
N1  - This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
TI  - SWIM: Short-Window CNN Integrated With Mamba for EEG-Based Auditory Spatial Attention Decoding
Y1  - 2025/01/16/
AV  - public
SP  - 1031
EP  - 1038
CY  - Macao
A1  - Zhang, Z
A1  - Thwaites, A
A1  - Woolgar, A
A1  - Moore, B
A1  - Zhang, C
KW  - Training
KW  - Accuracy
KW  - Source coding
KW  - Benchmark testing
KW  - Brain modeling
KW  - Feature extraction
KW  - Data augmentation
KW  - Electroencephalography
KW  - Data models
KW  - Decoding
N2  - In complex auditory environments, the human auditory system possesses the remarkable ability to focus on a specific speaker while disregarding others. In this study, a new model named SWIM, a short-window convolutional neural network (CNN) integrated with Mamba, is proposed for identifying the locus of auditory attention (left or right) from electroencephalography (EEG) signals without relying on speech envelopes. SWIM consists of two parts. The first is a short-window CNN (SWCNN), which acts as a short-term EEG feature extractor and achieves a final accuracy of 84.9% in the leave-one-speaker-out setup on the widely used KUL dataset. This improvement is due to the use of an improved CNN structure, data augmentation, multitask training, and model combination. The second part, Mamba, is a sequence model applied here for the first time to auditory spatial attention decoding, where it leverages long-term dependencies across previous SWCNN time steps. By jointly training SWCNN and Mamba, the proposed SWIM structure uses both short-term and long-term information and achieves an accuracy of 86.2%, which reduces the classification errors by a relative 31.0% compared to the previous state-of-the-art result.
ID  - discovery10205071
PB  - IEEE
UR  - https://doi.org/10.1109/slt61566.2024.10832311
ER  -