UCL Discovery

Model-based Parametric Prosody Synthesis with Deep Neural Network

Liu, H; Lu, H; Shao, X; Xu, Y; (2016) Model-based Parametric Prosody Synthesis with Deep Neural Network. In: Interspeech 2016. (pp. 2313-2317). ISCA. Green open access

Full text: Yi_Xu_Liu_etAl_Interspeech2016.pdf (711kB)

Abstract

Conventional statistical parametric speech synthesis (SPSS) captures only frame-wise acoustic observations and computes probability densities at the HMM state level to obtain statistical acoustic models combined with decision trees. It is therefore a purely statistical, data-driven approach that does not explicitly integrate any articulatory mechanisms found in speech production research. The present study explores an alternative paradigm, namely model-based parametric prosody synthesis (MPPS), which integrates dynamic mechanisms of human speech production as a core component of F0 generation. In this paradigm, contextual variations in prosody are processed in two separate yet integrated stages: linguistic to motor, and motor to acoustic. Here the motor model is target approximation (TA), which generates syllable-sized F0 contours with only three motor parameters that are associated with linguistic functions. In this study, we simulate this two-stage process by linking the TA model to a deep neural network (DNN), which learns the “linguistic-motor” mapping given the “motor-acoustic” mapping provided by TA-based syllable-wise F0 production. The proposed prosody modeling system outperforms the HMM-based baseline system in both objective and subjective evaluations.
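
The "motor to acoustic" stage described in the abstract can be illustrated with a minimal sketch. The abstract does not give the TA equations, so the code below assumes the quantitative target approximation (qTA) formulation, in which each syllable's F0 is a third-order critically damped approach to a linear target defined by three parameters: slope, height, and strength. The `Target` class, the function names, the semitone units, and the example values are all hypothetical and are not taken from the paper. The DNN stage, which would predict these parameters from linguistic features, is not shown.

```python
import math
from dataclasses import dataclass


@dataclass
class Target:
    """One syllable's pitch target (hypothetical container, not from the paper)."""
    slope: float     # m: slope of the linear target, semitones per second (assumed unit)
    height: float    # b: height of the linear target, semitones (assumed unit)
    strength: float  # lambda: rate at which F0 approaches the target, 1/s
    duration: float  # syllable duration, seconds


def syllable_state(tg, y0, v0, a0, t):
    """Return (f0, velocity, acceleration) at time t within one syllable.

    Assumed qTA form: f0(t) = (m*t + b) + (c1 + c2*t + c3*t^2) * exp(-lambda*t),
    with c1..c3 fixed by the initial state (y0, v0, a0) at syllable onset.
    """
    lam = tg.strength
    c1 = y0 - tg.height
    c2 = v0 + c1 * lam - tg.slope
    c3 = (a0 + 2.0 * c2 * lam - c1 * lam ** 2) / 2.0
    p = c1 + c2 * t + c3 * t ** 2   # transient polynomial
    dp = c2 + 2.0 * c3 * t          # its first derivative
    ddp = 2.0 * c3                  # its second derivative
    e = math.exp(-lam * t)
    f0 = tg.slope * t + tg.height + p * e
    vel = tg.slope + (dp - lam * p) * e
    acc = (ddp - 2.0 * lam * dp + lam ** 2 * p) * e
    return f0, vel, acc


def synthesize(targets, f0_start, step=0.005):
    """Generate an F0 contour for a sequence of syllable targets.

    The final state of each syllable becomes the initial state of the next,
    which keeps the contour continuous across syllable boundaries.
    """
    contour = []
    state = (f0_start, 0.0, 0.0)  # assume zero initial velocity and acceleration
    for tg in targets:
        n = round(tg.duration / step)
        for i in range(n):
            contour.append(syllable_state(tg, *state, i * step)[0])
        state = syllable_state(tg, *state, tg.duration)
    return contour


# Illustrative values only: a level target followed by a falling target.
example = synthesize(
    [Target(0.0, 90.0, 40.0, 0.2), Target(-20.0, 85.0, 40.0, 0.2)],
    f0_start=85.0,
)
```

In this sketch a larger `strength` makes F0 reach the target sooner within the syllable, while carrying the state across boundaries lets an earlier syllable shape the onset of the next one.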

Type: Proceedings paper
Title: Model-based Parametric Prosody Synthesis with Deep Neural Network
Event: Interspeech 2016
Location: San Francisco, California, United States
Dates: 08 September 2016 - 12 September 2016
Open access status: An open access version is available from UCL Discovery
DOI: 10.21437/Interspeech.2016-1325
Publisher version: http://www.isca-speech.org/archive/Interspeech_201...
Language: English
Additional information: Copyright © 2016 ISCA
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > Speech, Hearing and Phonetic Sciences
URI: https://discovery.ucl.ac.uk/id/eprint/1536164
Downloads since deposit: 192