UCL Discovery

Learnings from curating a trustworthy, well-annotated, and useful dataset of disordered English speech

Jiang, PP; Tobin, J; Tomanek, K; MacDonald, RL; Seaver, K; Cave, R; Ladewig, M; ... Green, JR (2024) Learnings from curating a trustworthy, well-annotated, and useful dataset of disordered English speech. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. (pp. 2490-2493). ISCA - International Speech Communication Association. Green open access.

Abstract

Project Euphonia, a Google initiative, is dedicated to improving automatic speech recognition (ASR) of disordered speech. A central objective of the project is to create a large, high-quality, and diverse speech corpus. This report describes the project's latest advancements in data collection and annotation methodologies, such as expanding speaker diversity in the database, adding human-reviewed transcript corrections and audio quality tags to 350K (of the 1.2M total) audio recordings, and amassing a comprehensive set of metadata (including more than 40 speech characteristic labels) for over 75% of the speakers in the database. We report on the impact of transcript corrections on our machine-learning (ML) research, inter-rater variability of assessments of disordered speech patterns, and our rationale for gathering speech metadata. We also consider the limitations of using automated off-the-shelf annotation methods for assessing disordered speech.

Type: Proceedings paper
Title: Learnings from curating a trustworthy, well-annotated, and useful dataset of disordered English speech
Event: Annual Conference of the International Speech Communication Association, INTERSPEECH 2024
Open access status: An open access version is available from UCL Discovery
DOI: 10.21437/Interspeech.2024-578
Publisher version: https://doi.org/10.21437/interspeech.2024-578
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher's terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10203956