UCL Discovery

Evolution of protein superfamilies and bacterial genome size

Ranea, JAG; Buchan, DWA; Thornton, JM; Orengo, CA; (2004) Evolution of protein superfamilies and bacterial genome size. J MOL BIOL, 336 (4) 871-887. 10.1016/j.jmb.2003.12.044.

Full text not available from this repository.


We present the structural annotation of 56 different bacterial species based on the assignment of genes to 816 evolutionary superfamilies in the CATH domain structure database. These assignments have enabled us to analyse the recurrence of specific superfamilies within and across the genomes. We have selected the superfamilies that have a very broad representation and therefore appear to be universally distributed in a significant number of bacterial lineages. Occurrence profiles of these universally distributed superfamilies are compared with genome size in order to estimate the correlation between superfamily duplication and the increase in proteome size. This distinguishes between size-dependent superfamilies, where frequency of occurrence is highly correlated with increase in genome size, and size-independent superfamilies, where no correlation is observed. Consideration of the size correlation and the ratio between the mean and the standard deviation for all the superfamily profiles allows more detailed subdivision and classification of superfamilies. For example, within the size-independent superfamilies, we distinguished a group that is distributed evenly amongst all the genomes. Within the size-dependent superfamilies we differentiated two groups: linearly distributed and non-linearly distributed. Functional annotation using the COG database was performed for all superfamilies in each of these groups, and this revealed significant differences amongst the three sets of superfamilies. Evenly distributed, size-independent domains are shown to be involved primarily in protein translation and biosynthesis. For the size-dependent superfamilies, linearly distributed superfamilies are involved mainly in metabolism, and non-linearly distributed superfamily domains are involved principally in gene regulation. (C) 2003 Elsevier Ltd. All rights reserved.
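
The two statistics named in the abstract, the correlation of a superfamily's occurrence profile with genome size and the ratio of the profile's mean to its standard deviation, can be illustrated with a minimal sketch. The function names, the cutoff values (`r_cutoff`, `ratio_cutoff`) and the example numbers are assumptions made for illustration only; the record does not state the thresholds or the exact procedure used in the paper, and the sketch does not attempt the linear versus non-linear subdivision.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def classify_profile(counts, genome_sizes, r_cutoff=0.7, ratio_cutoff=3.0):
    """Label one superfamily occurrence profile.

    counts[i] is the number of domains of the superfamily in genome i and
    genome_sizes[i] is the size of that genome (for example, gene count).
    Both cutoffs are hypothetical values, not taken from the paper.
    """
    r = pearson(genome_sizes, counts)
    sd = pstdev(counts)
    # A large mean/sd ratio means the counts vary little across genomes.
    ratio = float("inf") if sd == 0 else mean(counts) / sd
    if r >= r_cutoff:
        return "size-dependent"
    if ratio >= ratio_cutoff:
        return "size-independent, evenly distributed"
    return "size-independent"


# Made-up genome sizes and occurrence profiles, for illustration only.
sizes = [1000, 2000, 3000, 4000]
growing = classify_profile([2, 4, 6, 8], sizes)
flat = classify_profile([3, 3, 3, 3], sizes)
erratic = classify_profile([1, 9, 2, 8], sizes)
```

In this sketch a profile that grows with genome size is labelled size-dependent, a constant profile is labelled evenly distributed, and an erratic profile with weak correlation falls into the remaining size-independent group.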

Type: Article
Title: Evolution of protein superfamilies and bacterial genome size
DOI: 10.1016/j.jmb.2003.12.044
Keywords: protein family, three-dimensional structure, genome size, bacteria, domain distribution, STRUCTURAL GENOMICS, N-ACETYLTRANSFERASES, SIGNAL-TRANSDUCTION, GENE-TRANSFER, PSI-BLAST, DATABASE, SEQUENCES, CLASSIFICATION, CATH, SCOP
URI: http://discovery.ucl.ac.uk/id/eprint/119232