UCL Discovery

A speaker adaptive DNN training approach for speaker-independent acoustic inversion

Badino, L; Franceschi, L; Arora, R; Donini, M; Pontil, M; (2017) A speaker adaptive DNN training approach for speaker-independent acoustic inversion. In: Lacerda, F, (ed.) Proceedings of Interspeech 2017. (pp. 984-988). International Speech Communication Association (ISCA): Stockholm, Sweden. Green open access

0804.PDF - Published version


Abstract

We address the speaker-independent acoustic inversion (AI) problem, also referred to as acoustic-to-articulatory mapping. The scarce availability of multi-speaker articulatory data makes it difficult to learn a mapping which generalizes from a limited number of training speakers and reliably reconstructs the articulatory movements of unseen speakers. In this paper, we propose a Multi-task Learning (MTL)-based approach that explicitly separates the modeling of each training speaker's AI peculiarities from the modeling of AI characteristics that are shared by all speakers. Our approach stems from the well-known Regularized MTL approach and extends it to feed-forward deep neural networks (DNNs). Given multiple training speakers, we learn for each an acoustic-to-articulatory mapping represented by a DNN. Then, through an iterative procedure, we search for a canonical speaker-independent DNN that is "similar" to all speaker-dependent DNNs. The degree of similarity is controlled by a regularization parameter. We report experiments on the University of Wisconsin X-ray Microbeam Database under different training/testing experimental settings. The results obtained indicate that our MTL-trained canonical DNN largely outperforms a standardly trained (i.e., single-task learning-based) speaker-independent DNN.
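
The alternation between speaker-dependent models and a canonical model described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes linear maps in place of DNNs, a closed-form ridge-style update in place of gradient-based training, and synthetic data. The names `fit_canonical`, `objective`, and `lam` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def objective(tasks, ws, w0, lam):
    """Sum of per-speaker mean squared error plus lam * ||W_t - W_0||^2."""
    total = 0.0
    for (X, Y), W in zip(tasks, ws):
        total += np.mean(np.sum((X @ W - Y) ** 2, axis=1))
        total += lam * np.sum((W - w0) ** 2)
    return total


def fit_canonical(tasks, lam, n_iters=50):
    """Alternate between speaker-dependent maps and a shared canonical map.

    tasks: list of (X, Y) pairs, one per training speaker
           (X: acoustic features, Y: articulatory targets).
    lam:   regularization weight; larger values pull every
           speaker-dependent map closer to the canonical one.
    Assumption: linear maps stand in for the DNNs of the paper.
    """
    d_in = tasks[0][0].shape[1]
    d_out = tasks[0][1].shape[1]
    w0 = np.zeros((d_in, d_out))
    history = []
    for _ in range(n_iters):
        ws = []
        for X, Y in tasks:
            n = X.shape[0]
            # Exact minimizer of the speaker's loss with W_0 held fixed.
            A = X.T @ X / n + lam * np.eye(d_in)
            B = X.T @ Y / n + lam * w0
            ws.append(np.linalg.solve(A, B))
        # With the speaker maps fixed, the best canonical map is their mean.
        w0 = np.mean(ws, axis=0)
        history.append(objective(tasks, ws, w0, lam))
    return w0, ws, history


# Synthetic stand-in data: three "speakers" sharing a common map
# plus a small speaker-specific perturbation.
rng = np.random.default_rng(0)
w_shared = rng.normal(size=(5, 2))
tasks = []
for _ in range(3):
    X = rng.normal(size=(200, 5))
    W_spk = w_shared + 0.3 * rng.normal(size=(5, 2))
    Y = X @ W_spk + 0.05 * rng.normal(size=(200, 2))
    tasks.append((X, Y))

w0, ws, history = fit_canonical(tasks, lam=1.0)
```

In this toy setting each update is an exact minimization over one block of variables, so the objective cannot increase between iterations. With DNNs the speaker-dependent step would instead be an approximate, gradient-based update, and the averaging step would act on network weights.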

Type: Proceedings paper
Title: A speaker adaptive DNN training approach for speaker-independent acoustic inversion
Event: Interspeech 2017
Open access status: An open access version is available from UCL Discovery
DOI: 10.21437/Interspeech.2017-804
Publisher version: https://www.isca-speech.org/archive/Interspeech_20...
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: acoustic inversion, acoustic-to-articulatory mapping, multi-task learning, XRMB
UCL classification: UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10046566
