UCL Discovery

ELO-SPHERES intelligibility prediction model for the Clarity Prediction Challenge 2022

Huckvale, Mark; Hilkhuysen, Gaston; (2022) ELO-SPHERES intelligibility prediction model for the Clarity Prediction Challenge 2022. In: Proceedings of the 23rd INTERSPEECH Conference. (pp. 3934-3938). ISCA: Incheon, Korea. Green open access

Text: is2022clarity.pdf - Submitted Version (283kB)
Abstract

This paper describes and evaluates the ELO-SPHERES project sentence intelligibility model for the Clarity Prediction Challenge 2022. The aim of the model is to predict the intelligibility of enhanced speech for hearing-impaired listeners. Input to the model is binaurally processed audio of short sentences generated in a simulated noisy and reverberant environment, together with the original source audio. Output of the model is a prediction of the intelligibility of each sentence, in terms of percentage of words correct, for a known hearing-impaired listener characterized by a pure-tone audiogram. Models are evaluated in terms of the root-mean-squared error of prediction. We approached this problem in three stages: (i) evaluation of the influence of the scene metadata on scores, (ii) construction of classifiers for estimating scene metadata from audio, and (iii) training a non-linear regression model on the challenge data, evaluated using 5-fold cross-validation. On the test data, a baseline system using only the standard short-time objective intelligibility (STOI) metric on the better ear achieved an RMS prediction error of 27%, while our model, which also took into account given and estimated scene data, achieved an RMS error of 22%.
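The evaluation protocol in stage (iii) — fitting a regression model that maps an intelligibility metric to percentage of words correct, scored by 5-fold cross-validated RMSE — can be sketched as follows. This is an illustrative sketch only, not the authors' code: the data are synthetic, a simple linear fit stands in for the paper's non-linear regressor, and the `kfold_rmse` helper is a hypothetical name.

```python
import math
import random

def rmse(y_true, y_pred):
    # root-mean-squared error between true and predicted scores
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def fit_linear(xs, ys):
    # ordinary least squares for y = a*x + b (stand-in for a non-linear regressor)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def kfold_rmse(xs, ys, k=5, seed=0):
    # shuffle indices, split into k folds, train on k-1 folds, test on the held-out fold
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        held_out = set(folds[i])
        tr_x = [xs[j] for j in idx if j not in held_out]
        tr_y = [ys[j] for j in idx if j not in held_out]
        a, b = fit_linear(tr_x, tr_y)
        preds = [a * xs[j] + b for j in folds[i]]
        errors.append(rmse([ys[j] for j in folds[i]], preds))
    return sum(errors) / k  # mean RMSE across folds

# hypothetical data: a STOI-like score per sentence vs. % words correct
random.seed(1)
xs = [random.uniform(0.2, 0.9) for _ in range(100)]
ys = [min(100.0, max(0.0, 120 * x - 20 + random.gauss(0, 8))) for x in xs]
print(f"5-fold CV RMSE: {kfold_rmse(xs, ys):.1f}% words correct")
```

The challenge model additionally feeds given and estimated scene metadata into the regression alongside the better-ear intelligibility metric, which is what reduces the test-set RMS error from 27% to 22%.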

Type: Proceedings paper
Title: ELO-SPHERES intelligibility prediction model for the Clarity Prediction Challenge 2022
Event: Interspeech Conference
Location: Incheon, South Korea
Dates: 18 Sep 2022 - 22 Sep 2022
Open access status: An open access version is available from UCL Discovery
DOI: 10.21437/Interspeech.2022-10521
Publisher version: https://doi.org/10.21437/Interspeech.2022-10521
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: speech-in-noise, speech intelligibility, hearing aids, hearing loss, machine learning
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > Speech, Hearing and Phonetic Sciences
URI: https://discovery.ucl.ac.uk/id/eprint/10181493
