UCL Discovery

Analysis and classification of protein structures

Michie, A. D. (1997) Analysis and classification of protein structures. Doctoral thesis (Ph.D), UCL (University College London). Green open access


Classification of protein sequence and structure is one step towards a potential, but as yet unrealised, solution of the protein folding problem and associated questions. The work presented in this thesis first examines the distribution of protein domains into 'structural classes' and second describes the production of a complete hierarchical classification of the structures in the Brookhaven Protein Databank and its publication on the World Wide Web. To examine structural class, an automatic method is developed to assign classes to protein domains, and the results are compared against the original 4-class model of Levitt and Chothia (1976). It is found that the distinction between the α/β and α+β classes is impossible to characterise without using topological information; thus a 3-class model, with a single αβ class replacing α/β and α+β, more accurately represents the distribution present in larger contemporary datasets. Overall, the method is capable of class assignment in 90% of cases. The unclassified remainder is an unavoidable consequence of the continuous nature of the structure distribution observed between the 3 classes. A complete hierarchical classification of protein structures, entitled CATH after its first 4 levels (Class, Architecture, Topology and Homologous superfamily), is developed from the original work of Orengo et al. (1993), and the insights afforded by this data into the nature of protein fold space are discussed. Finally, the presentation of the CATH and derived data in an electronic format accessible via the World Wide Web is described.
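
As an illustration of the four-level hierarchy named in the abstract (Class, Architecture, Topology, Homologous superfamily), the following minimal Python sketch represents a position in such a hierarchy and reports the deepest level that two entries share. It is not taken from the thesis: the dotted numeric identifier format, the names `CathNode`, `parse_cath_code` and `deepest_shared_level`, and the example identifiers are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional

# The four level names come from the abstract; everything else is illustrative.
LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


class CathNode(NamedTuple):
    """One position in a four-level hierarchy (hypothetical representation)."""
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_code(code: str) -> CathNode:
    """Parse a dotted identifier such as '1.10.20.30' (assumed format)."""
    parts = code.split(".")
    if len(parts) != len(LEVELS):
        raise ValueError(f"expected {len(LEVELS)} dot-separated fields, got {len(parts)}")
    return CathNode(*(int(p) for p in parts))


def shared_depth(a: CathNode, b: CathNode) -> int:
    """Count how many levels, from the top down, the two nodes have in common."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


def deepest_shared_level(a: CathNode, b: CathNode) -> Optional[str]:
    """Return the name of the deepest shared level, or None if even Class differs."""
    depth = shared_depth(a, b)
    return LEVELS[depth - 1] if depth else None


if __name__ == "__main__":
    # Made-up identifiers, not real database entries.
    first = parse_cath_code("1.10.20.30")
    second = parse_cath_code("1.10.20.40")
    print(deepest_shared_level(first, second))
```

In this sketch two entries that agree on the first three fields but differ in the fourth are reported as sharing a Topology but not a Homologous superfamily, which mirrors the top-down ordering of the levels given in the abstract.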

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Analysis and classification of protein structures
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Thesis digitised by ProQuest.
Keywords: Biological sciences; Protein structures
URI: https://discovery.ucl.ac.uk/id/eprint/10101921