UCL Discovery

Pushing the envelope: Evaluating speech rhythm with different envelope extraction techniques

Macintyre, AD; Cai, CQ; Scott, SK (2022) Pushing the envelope: Evaluating speech rhythm with different envelope extraction techniques. Journal of the Acoustical Society of America, 151 (3) pp. 2002-2026. 10.1121/10.0009844. Green open access

10.0009844.pdf - Published Version


Abstract

The amplitude of the speech signal varies over time, and the speech envelope is an attempt to characterise this variation in the form of an acoustic feature. Although tacitly assumed, the similarity between the speech envelope-derived time series and that of phonetic objects (e.g., vowels) remains empirically unestablished. The current paper, therefore, evaluates several speech envelope extraction techniques, such as the Hilbert transform, by comparing different acoustic landmarks (e.g., peaks in the speech envelope) with manual phonetic annotation in a naturalistic and diverse dataset. Joint speech tasks are also introduced to determine which acoustic landmarks are most closely coordinated when voices are aligned. Finally, the acoustic landmarks are evaluated as predictors for the temporal characterisation of speaking style using classification tasks. The landmark that performed most closely to annotated vowel onsets was peaks in the first derivative of a human audition-informed envelope, consistent with converging evidence from neural and behavioural data. However, differences also emerged based on language and speaking style. Overall, the results show that both the choice of speech envelope extraction technique and the form of speech under study affect how sensitive an engineered feature is at capturing aspects of speech rhythm, such as the timing of vowels.

Type: Article
Title: Pushing the envelope: Evaluating speech rhythm with different envelope extraction techniques
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1121/10.0009844
Publisher version: https://doi.org/10.1121/10.0009844
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Humans, Language, Phonetics, Speech, Speech Perception, Voice
UCL classification: UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > Institute of Cognitive Neuroscience
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences
URI: https://discovery.ucl.ac.uk/id/eprint/10147582