Chen, Y; Xu, Y; (2025) Sequential perception of tone and focus in parallel–A computational simulation. Speech Communication, 168, Article 103173. 10.1016/j.specom.2024.103173. (In press).
Abstract
Speech is produced continuously over time, so the information it conveys, including intonational functions, also unfolds over time. Yet many intonational functions are encoded across whole utterances rather than within individual words. How can perception process speech signals continuously over time, even for communicative functions that are globally encoded? In this study, we used computational simulation to test the idea that even intonational functions with large temporal scopes can be processed syllable by syllable from f0 contours and recognized through continuous, progressive probabilistic inference. We trained support vector machine (SVM) and gated recurrent unit (GRU) models to simulate the perception of Mandarin tone and sentence focus with either syllable-sized or sentence-sized f0 contours as input. The sentence-wide f0 contours were gated at different syllable locations to test the incrementality of the recognition of tone and intonation. We also tested human listeners’ perception of tone and focus with full and fragmented f0 contours from the same dataset to evaluate the validity of the simulated perception. The results showed that the simulated syllable-by-syllable processing of tone and focus generated recognition patterns closest to human perception. The simulations also showed that it makes little difference whether tone and focus are recognized separately or as tone-focus combinations, which suggests that, despite sharing the same acoustic dimension, the two functions are sufficiently separated from each other in their f0 coding.
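The gating and progressive-inference ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' SVM or GRU pipeline: the function names `gate_contour` and `progressive_posteriors`, the category labels, and the assumption that per-syllable evidence combines independently (a naive-Bayes-style update) are all hypothetical choices made only to show the general shape of syllable-by-syllable recognition.

```python
import math


def gate_contour(syllable_contours, gate):
    """Return the first `gate` syllables of a sentence-wide f0 contour.

    `syllable_contours` is a list with one sequence of f0 samples per
    syllable (hypothetical data layout, not taken from the paper).
    """
    if not 0 <= gate <= len(syllable_contours):
        raise ValueError("gate must lie between 0 and the number of syllables")
    return [list(contour) for contour in syllable_contours[:gate]]


def progressive_posteriors(syllable_log_likelihoods, log_prior):
    """Update a posterior over sentence-level categories after each syllable.

    `syllable_log_likelihoods` holds one dict per syllable, mapping each
    category to log p(syllable f0 | category). Treating syllables as
    conditionally independent is an assumption of this sketch.
    """
    log_post = dict(log_prior)
    history = []
    for log_lik in syllable_log_likelihoods:
        # Add the new syllable's evidence to the running log-posterior.
        for category in log_post:
            log_post[category] += log_lik[category]
        # Normalise with log-sum-exp so the probabilities sum to one.
        peak = max(log_post.values())
        log_norm = peak + math.log(
            sum(math.exp(v - peak) for v in log_post.values())
        )
        log_post = {c: v - log_norm for c, v in log_post.items()}
        history.append({c: math.exp(v) for c, v in log_post.items()})
    return history
```

Each entry of the returned history is the estimate available after one more syllable has been heard, which is the sense in which a globally encoded function such as focus could be recognized incrementally rather than only at the end of the utterance.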
Type: | Article |
---|---|
Title: | Sequential perception of tone and focus in parallel–A computational simulation |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1016/j.specom.2024.103173 |
Publisher version: | https://doi.org/10.1016/j.specom.2024.103173 |
Language: | English |
Additional information: | © 2024 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/) |
Keywords: | Tone perception, Focus perception, Computational modeling, Gated recurrent unit (GRU), Sequential speech processing, Co-current processing |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > Speech, Hearing and Phonetic Sciences |
URI: | https://discovery.ucl.ac.uk/id/eprint/10204315 |