
Sandglasset: A Light Multi-Granularity Self-Attentive Network for Time-Domain Speech Separation

Lam, MWY; Wang, J; Su, D; Yu, D; (2021) Sandglasset: A Light Multi-Granularity Self-Attentive Network for Time-Domain Speech Separation. In: Proceedings of the ICASSP 2021 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). (pp. 5759-5763). IEEE. Green open access

Abstract

One of the leading single-channel speech separation (SS) models is based on a TasNet with a dual-path segmentation technique, where the size of each segment remains unchanged throughout all layers. In contrast, our key finding is that multi-granularity features are essential for enhancing contextual modeling and computational efficiency. We introduce a self-attentive network with a novel sandglass shape, named Sandglasset, which advances the state-of-the-art (SOTA) SS performance at a significantly smaller model size and computational cost. Moving forward through the blocks of Sandglasset, the temporal granularity of the features gradually becomes coarser until the halfway point of the network, and then successively becomes finer towards the raw signal level. We also find that residual connections between features with the same granularity are critical for preserving information after passing through the bottleneck layer. Experiments show that Sandglasset, with only 2.3M parameters, achieves the best results on two benchmark SS datasets, WSJ0-2mix and WSJ0-3mix, where the SI-SNRi scores are improved by an absolute 0.6 dB and 2.4 dB, respectively, compared to the prior SOTA results.
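The sandglass shape described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the power-of-`base` downsampling factors, the average-pool/repeat resampling, and the `tanh` placeholder standing in for each self-attentive block are all assumptions made for illustration only. It shows only the two structural ideas named above: granularity that coarsens up to the middle block and then refines, and residual connections between blocks that share the same granularity.

```python
import math


def granularity_schedule(num_blocks, base=4):
    """Hypothetical downsampling factor per block: coarser up to the midpoint, then finer."""
    half = num_blocks // 2
    return [base ** (b if b < half else num_blocks - 1 - b) for b in range(num_blocks)]


def downsample(seq, factor):
    """Average-pool a 1-D sequence by `factor` (length must be divisible by factor)."""
    return [sum(seq[i:i + factor]) / factor for i in range(0, len(seq), factor)]


def upsample(seq, factor):
    """Repeat each element `factor` times to restore the original length."""
    return [v for v in seq for _ in range(factor)]


def sandglass_forward(seq, num_blocks=6, base=2):
    """Toy forward pass over a 1-D feature sequence (assumed structure, not the paper's code)."""
    factors = granularity_schedule(num_blocks, base)
    if len(seq) % max(factors) != 0:
        raise ValueError("sequence length must be divisible by the coarsest factor")
    saved = {}  # output of the first block seen at each granularity
    x = list(seq)
    for f in factors:
        # Placeholder for a self-attentive block operating at granularity f.
        coarse = [math.tanh(v) for v in downsample(x, f)]
        out = upsample(coarse, f)
        if f in saved:
            # Residual connection between features with the same granularity.
            out = [o + s for o, s in zip(out, saved[f])]
        else:
            saved[f] = out
        x = out
    return x
```

For example, `granularity_schedule(6, 4)` gives a symmetric schedule that is coarsest in the two middle blocks, and `sandglass_forward` returns a sequence of the same length as its input, since every block upsamples back to the original resolution before passing its output on.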

Type: Proceedings paper
Title: Sandglasset: A Light Multi-Granularity Self-Attentive Network for Time-Domain Speech Separation
Event: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Location: Toronto, ON, Canada
Dates: 6th-11th June 2021
ISBN-13: 978-1-7281-7605-5
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/ICASSP39728.2021.9413837
Publisher version: https://doi.org/10.1109/ICASSP39728.2021.9413837
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
Keywords: Speech separation, multi-granularity, self-attentive network, single-channel
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL
URI: https://discovery.ucl.ac.uk/id/eprint/10154107