Badino, L;
Franceschi, L;
Arora, R;
Donini, M;
Pontil, M;
(2017)
A speaker adaptive DNN training approach for speaker-independent acoustic inversion.
In: Lacerda, F, (ed.)
Proceedings of Interspeech 2017.
(pp. 984-988).
International Speech Communication Association (ISCA): Stockholm, Sweden.
Abstract
We address the speaker-independent acoustic inversion (AI) problem, also referred to as acoustic-to-articulatory mapping. The scarce availability of multi-speaker articulatory data makes it difficult to learn a mapping that generalizes from a limited number of training speakers and reliably reconstructs the articulatory movements of unseen speakers. In this paper, we propose a Multi-task Learning (MTL)-based approach that explicitly separates the modeling of each training speaker's AI peculiarities from the modeling of AI characteristics that are shared by all speakers. Our approach stems from the well-known Regularized MTL approach and extends it to feed-forward deep neural networks (DNNs). Given multiple training speakers, we learn for each an acoustic-to-articulatory mapping represented by a DNN. Then, through an iterative procedure, we search for a canonical speaker-independent DNN that is "similar" to all speaker-dependent DNNs. The degree of similarity is controlled by a regularization parameter. We report experiments on the University of Wisconsin X-ray Microbeam Database under different training/testing experimental settings. The results obtained indicate that our MTL-trained canonical DNN largely outperforms a standardly trained (i.e., single-task learning-based) speaker-independent DNN.
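The idea of a canonical model kept "similar" to per-speaker models can be illustrated with a minimal sketch of classic Regularized MTL. This is not the paper's implementation: linear maps stand in for the DNNs, the data are synthetic, and the function names (`make_synthetic_speakers`, `fit_regularized_mtl`), the objective's exact form, and the parameter `lam` are assumptions made for illustration only.

```python
import numpy as np


def make_synthetic_speakers(n_speakers=4, n_frames=200, d_in=6, d_out=3, seed=0):
    """Hypothetical data: each speaker's map is a shared map plus a small deviation."""
    rng = np.random.default_rng(seed)
    W_shared = rng.normal(size=(d_in, d_out))
    tasks = []
    for _ in range(n_speakers):
        W_spk = W_shared + 0.1 * rng.normal(size=(d_in, d_out))
        X = rng.normal(size=(n_frames, d_in))  # stand-in for acoustic features
        Y = X @ W_spk + 0.01 * rng.normal(size=(n_frames, d_out))  # stand-in for articulatory targets
        tasks.append((X, Y))
    return W_shared, tasks


def fit_regularized_mtl(tasks, lam=1.0, n_iters=50):
    """Alternating minimisation of the (assumed) objective
        sum_t ||X_t W_t - Y_t||^2 + lam * sum_t ||W_t - W_0||^2
    over speaker-dependent maps W_t and a canonical map W_0.
    """
    d_in = tasks[0][0].shape[1]
    d_out = tasks[0][1].shape[1]
    W0 = np.zeros((d_in, d_out))
    eye = np.eye(d_in)
    Ws = []
    for _ in range(n_iters):
        # Speaker step: ridge regression pulled toward the current canonical map.
        Ws = [
            np.linalg.solve(X.T @ X + lam * eye, X.T @ Y + lam * W0)
            for X, Y in tasks
        ]
        # Canonical step: the minimiser of sum_t ||W_t - W0||^2 is the mean.
        W0 = np.mean(Ws, axis=0)
    return W0, Ws


if __name__ == "__main__":
    W_shared, tasks = make_synthetic_speakers()
    W0, Ws = fit_regularized_mtl(tasks, lam=1.0)
    print("distance of canonical map from shared map:", np.linalg.norm(W0 - W_shared))
```

In this sketch, a larger `lam` pulls every speaker-dependent map closer to the canonical one, while `lam` near zero lets each speaker's map fit its own data almost independently. Extending the same penalty to DNN weights would require gradient-based training in place of the closed-form solves used here.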
Type: | Proceedings paper |
---|---|
Title: | A speaker adaptive DNN training approach for speaker-independent acoustic inversion |
Event: | Interspeech 2017 |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.21437/Interspeech.2017-804 |
Publisher version: | https://www.isca-speech.org/archive/Interspeech_20... |
Language: | English |
Additional information: | This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | acoustic inversion, acoustic-to-articulatory mapping, multi-task learning, XRMB |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10046566 |