UCL Discovery

Modelling Segmental Variability for Automatic Speech Recognition

Holmes, Wendy J. (1997) Modelling Segmental Variability for Automatic Speech Recognition. Doctoral thesis (Ph.D), UCL (University College London). Green open access.



This thesis describes work developing an approach to automatic speech recognition which incorporates a more realistic underlying model of speech than the currently-successful technique of hidden Markov models (HMMs). Whereas HMM states are associated with individual (frame-based) feature vectors, the new models represent sequences or "segments" of features. The segment models described in this thesis are referred to as "segmental HMMs", and incorporate the concept of trajectories to describe how features change over time together with a novel representation of segmental variability. Extra-segmental variability between different examples of a sub-phonemic speech segment is modelled separately from intra-segmental variability within any one example. The extra-segmental component of the model is represented in terms of variability in the trajectory parameters, and can be regarded as providing a prior constraint on the possible observation sequences that can be generated by the model. The work which forms the basis for the thesis has concentrated on investigating the representation of the two types of variability in relation to characteristics of speech data and to recognition performance. Experiments have demonstrated that a segmental HMM can give improvements in recognition performance, both for a connected-digit recognition task and for a phonetic classification task. However, the model only worked well when the modelling assumptions were a reasonable approximation to the characteristics of real speech. Firstly, it was important that both the extra- and intra-segment model distributions were fairly accurate across all segment durations, in order for the two types of probability to balance appropriately in recognition tasks. In addition, the trajectory descriptions needed to be reasonably accurate, as demonstrated by the finding that segmental HMMs of sub-phone speech segments gave performance advantages when using a linear trajectory representation but not for a static trajectory description. The thesis concludes with a discussion of the experimental findings in relation to several design issues for developing a segmental model that truly reflects the characteristics of real speech signals.
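The split between the two kinds of variability can be illustrated with a small sketch. It is not taken from the thesis: it assumes a single scalar feature, Gaussian distributions for both components, and a score based on the least-squares best-fit trajectory for each segment. All function and parameter names (`sample_segment`, `fit_trajectory`, `segment_log_score`, `mid_sd`, `intra_sd`, and so on) are hypothetical.

```python
import math
import random


def sample_segment(duration, mid_mean, mid_sd, slope_mean, slope_sd, intra_sd, rng):
    """Draw one segment of scalar features from a linear-trajectory segment model."""
    # Extra-segmental variability: the trajectory parameters differ between examples.
    mid = rng.gauss(mid_mean, mid_sd)
    slope = rng.gauss(slope_mean, slope_sd)
    # Intra-segmental variability: frames scatter around the chosen trajectory.
    centre = (duration - 1) / 2.0
    return [rng.gauss(mid + slope * (t - centre), intra_sd) for t in range(duration)]


def fit_trajectory(frames, linear=True):
    """Least-squares fit of a static (constant) or linear trajectory.

    Time is centred on the segment midpoint, so the fitted mid-point value
    is simply the mean of the frames.
    """
    n = len(frames)
    centre = (n - 1) / 2.0
    mid = sum(frames) / n
    if not linear or n < 2:
        return mid, 0.0
    denom = sum((t - centre) ** 2 for t in range(n))
    slope = sum((t - centre) * y for t, y in enumerate(frames)) / denom
    return mid, slope


def gauss_logpdf(x, mean, sd):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - (x - mean) ** 2 / (2.0 * sd * sd)


def segment_log_score(frames, mid_mean, mid_sd, slope_mean, slope_sd, intra_sd, linear=True):
    """Return (extra, intra) log scores for one segment.

    Assumption: the segment is scored with its best-fit trajectory; this is a
    simplification chosen for illustration.
    """
    mid, slope = fit_trajectory(frames, linear)
    # Extra-segmental term: how plausible is this trajectory for the model?
    extra = gauss_logpdf(mid, mid_mean, mid_sd)
    if linear:
        extra += gauss_logpdf(slope, slope_mean, slope_sd)
    # Intra-segmental term: how closely do the frames follow the trajectory?
    centre = (len(frames) - 1) / 2.0
    intra = sum(
        gauss_logpdf(y, mid + slope * (t - centre), intra_sd)
        for t, y in enumerate(frames)
    )
    return extra, intra
```

For a segment whose feature rises steadily, such as `[1.0, 2.0, 3.0]`, the linear fit leaves no residual, so its intra-segmental score is higher than that of the static fit. Note also that the intra-segmental term sums over frames while the extra-segmental term is counted once per segment, so the relative weight of the two terms in this sketch changes with segment duration.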

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Modelling Segmental Variability for Automatic Speech Recognition
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Thesis digitised by ProQuest.
Keywords: Language, literature and linguistics; Automatic speech recognition
URI: https://discovery.ucl.ac.uk/id/eprint/10104934