Neural Network Architecture That Combines Temporal and Summative Features for Infant Cry Classification in the Interspeech 2018 Computational Paralinguistics Challenge

Huckvale, M; (2018) Neural Network Architecture That Combines Temporal and Summative Features for Infant Cry Classification in the Interspeech 2018 Computational Paralinguistics Challenge. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2018. (pp. 137-141). International Speech Communication Association (ISCA): Hyderabad, India. Green open access

Abstract

This paper describes the application of a novel deep neural network architecture to the classification of infant vocalisations as part of the Interspeech 2018 Computational Paralinguistics Challenge. Previous approaches to infant cry classification have either applied a statistical classifier to summative features of the whole cry, or applied a syntactic pattern recognition technique to a temporal sequence of features. In this work we explore a deep neural network architecture that exploits both temporal and summative features to make a joint classification. The temporal input comprises centi-second frames of low-level signal features which are input to LSTM nodes, while the summative vector comprises a large set of statistical functionals of the same frames that are input to MLP nodes. The combined network is jointly optimized and evaluated using leave-one-speaker-out cross-validation on the challenge training set. Results are compared to independently-trained temporal and summative networks and to a baseline SVM classifier. The combined model outperforms the other models and the challenge baseline on the training set. While problems remain in finding the best configuration and training protocol for such networks, the approach seems promising for future signal classification tasks.
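The two ideas in the abstract that are easiest to make concrete are the summative vector (statistical functionals computed over the same frames that feed the temporal input) and leave-one-speaker-out cross-validation. The following is a minimal sketch of both in plain Python, not the paper's implementation. The function names `summative_vector` and `leave_one_speaker_out` are hypothetical, and the four functionals used here (mean, standard deviation, minimum, maximum) are an assumption standing in for the much larger functional set the abstract mentions.

```python
from statistics import mean, pstdev


def summative_vector(frames):
    """Collapse a variable-length frame sequence into one fixed-length vector.

    `frames` is a list of equal-length feature lists, one per frame.
    Assumption: four functionals per feature (mean, population standard
    deviation, min, max); the paper's actual functional set is larger.
    """
    vector = []
    for column in zip(*frames):  # iterate feature by feature across all frames
        vector.extend([mean(column), pstdev(column), min(column), max(column)])
    return vector


def leave_one_speaker_out(speaker_ids):
    """Yield (held_out_speaker, train_indices, test_indices) for each speaker.

    Every recording from the held-out speaker goes to the test fold, so no
    speaker contributes to both training and testing in the same fold.
    """
    for speaker in sorted(set(speaker_ids)):
        test_idx = [i for i, s in enumerate(speaker_ids) if s == speaker]
        train_idx = [i for i, s in enumerate(speaker_ids) if s != speaker]
        yield speaker, train_idx, test_idx


# Toy data (illustrative values only): two frames with two features each.
toy_frames = [[1.0, 2.0], [3.0, 6.0]]
toy_summary = summative_vector(toy_frames)

# Four recordings from three hypothetical speakers.
toy_speakers = ["a", "b", "a", "c"]
toy_folds = list(leave_one_speaker_out(toy_speakers))
```

In a combined model of the kind the abstract describes, the raw frame sequence would go to the recurrent branch and the output of `summative_vector` to the feed-forward branch, with both branches trained and scored inside each fold produced by `leave_one_speaker_out`.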

Type: Proceedings paper
Title: Neural Network Architecture That Combines Temporal and Summative Features for Infant Cry Classification in the Interspeech 2018 Computational Paralinguistics Challenge
Event: Annual Conference of the International Speech Communication Association, INTERSPEECH 2018
Open access status: An open access version is available from UCL Discovery
DOI: 10.21437/Interspeech.2018-1959
Publisher version: https://doi.org/10.21437/Interspeech.2018-1959
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > Speech, Hearing and Phonetic Sciences
URI: https://discovery.ucl.ac.uk/id/eprint/10060444