UCL Discovery

Functional classification of protein domain superfamilies for protein function annotation

Das, S. (2016) Functional classification of protein domain superfamilies for protein function annotation. Doctoral thesis, UCL (University College London). Green open access

Sayoni_Das_PhD_Thesis_2016.pdf - Submitted Version


Abstract

Proteins are made up of domains, which are generally considered to be independent evolutionary and structural units with distinct functional properties. It is now well established that analysing the domains in a protein provides an effective approach to understanding protein function using a 'domain grammar'. To this end, evolutionarily related protein domains have been classified into homologous superfamilies in the CATH and SCOP databases. An ideal functional sub-classification of the domain superfamilies into 'functional families' can not only help in the function annotation of uncharacterised sequences but also provide a useful framework for understanding the diversity and evolution of function at the domain level. This work describes the development of a new protocol (FunFHMMer) for identifying functional families in CATH superfamilies. The protocol uses sequence patterns only and is therefore unaffected by the incompleteness of function annotations, annotation biases or misannotations in the databases. The resulting family classification was validated using known functional information and was found to generate more functionally coherent families than other domain-based protein resources. A protein function prediction pipeline was developed that exploits the functional annotations provided by the domain families; it was validated on a database rollback benchmark set of proteins and by an independent assessment by CAFA 2. The functional classification was found to capture the functional diversity of superfamilies well in terms of sequence, structure and protein context. This aided studies on the evolution of protein domain function both at the superfamily level and in specific proteins of interest. The conserved positions in the functional family alignments were found to be enriched in catalytic site residues and ligand-binding site residues, which led to the development of a functional site prediction tool. Lastly, the function prediction tools were assessed for the annotation of moonlighting functions of proteins, and a classification of moonlighting proteins was proposed based on their structure-function relationships.

Type: Thesis (Doctoral)
Title: Functional classification of protein domain superfamilies for protein function annotation
Event: UCL (University College London)
Open access status: An open access version is available from UCL Discovery
Language: English
Keywords: protein function, function prediction, function diversity, protein evolution, protein family, protein superfamily, bioinformatics, protein classification, protein function annotation, moonlighting proteins, funfhmmer, cath funfams, protein function classification
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Structural and Molecular Biology
URI: https://discovery.ucl.ac.uk/id/eprint/1519640
Downloads since deposit: 386
