UCL Discovery

Consensus templates for protein structure recognition

Sillitoe, Ian; (2002) Consensus templates for protein structure recognition. Doctoral thesis (Ph.D), UCL (University College London). Green open access

Abstract

Molecular biology has moved into the new millennium with the human genome sequenced and publicly available. The challenge now facing the bioinformatics field is to assign structural and functional information to protein sequences generated by this and many other genomic projects. To meet this challenge, several structural genomics initiatives are currently underway with the aim of providing, where possible, a protein structure within homology modelling distance for every known sequence. As a result, structure classification databases will need to provide novel methods in order to cope with this high influx of structures. This thesis presents work on the classification, analysis and recognition of protein structures using the CATH protein structure classification database. Structural similarity is measured by comparing contact maps, that is, the points of contact between amino acid residues. By examining related structures, it has been possible to identify contacts that have been highly conserved during the process of evolution. Protocols have been developed to generate accurate multiple structure alignments and 3D templates based on the consensus contact patterns found in these alignments. Templates have been generated for all homologous superfamilies in CATH to create a library of unique and identifying 'fingerprint' patterns. These templates were applied to the recognition of models generated at an early stage of ab initio protein structure prediction. Scanning these early models against a library of templates describing conserved contacts allowed the most likely superfamily to be identified. An algorithm was also written that performed fold recognition using only a limited set of contacts, with the purpose of application to the early stages of experimental NMR structure determination. Finally, the multiple structural alignments have been used to generate a library of hidden Markov models (HMMs). These structure-based sequence profiles were thoroughly benchmarked using a strict dataset of remote homologues and appear to outperform other commonly used sequence methods. This work was generously supported by the Biotechnology and Biological Sciences Research Council.
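
The contact-map idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's protocol: the function names, the 8 Å distance cutoff, the minimum sequence separation, and the 75% conservation threshold are all assumptions chosen for illustration, and the structures are assumed to be already aligned so that the same index refers to equivalent residues in every structure.

```python
from itertools import combinations
from math import dist


def contact_map(coords, cutoff=8.0, min_separation=3):
    """Return the set of residue index pairs (i, j), i < j, that are in contact.

    `coords` holds one (x, y, z) point per residue (for example a C-alpha
    position). The cutoff in angstroms and the minimum sequence separation
    are illustrative assumptions, not values taken from the thesis.
    """
    return {
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(coords), 2)
        if j - i >= min_separation and dist(a, b) <= cutoff
    }


def consensus_contacts(maps, min_fraction=0.75):
    """Keep contacts present in at least `min_fraction` of the aligned maps."""
    counts = {}
    for contacts in maps:
        for pair in contacts:
            counts[pair] = counts.get(pair, 0) + 1
    threshold = min_fraction * len(maps)
    return {pair for pair, n in counts.items() if n >= threshold}


def template_score(template, query_map):
    """Fraction of the template's contacts that also occur in the query map."""
    if not template:
        return 0.0
    return len(template & query_map) / len(template)
```

Under these assumptions, a query model would be scored against each superfamily's consensus set, with the highest-scoring template suggesting the most likely superfamily.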

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Consensus templates for protein structure recognition
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Thesis digitised by ProQuest.
Keywords: Biological sciences; Protein structure
URI: https://discovery.ucl.ac.uk/id/eprint/10102411
