UCL Discovery

An analysis of intron positions in relation to protein structure, function and evolution

Whamond, Gordon Stuart (2004). An analysis of intron positions in relation to protein structure, function and evolution. Doctoral thesis (Ph.D), UCL (University College London). Green open access.


The genes of most eukaryotic organisms are organised into coding (exon) and noncoding (intron) sequences. Since their discovery, introns have been the subject of a considerable amount of research focused on trying to locate correlations between intron positions and protein structure. However, in many cases these studies have suffered from using small datasets. In recent years, the amount of available nucleotide sequence data, protein sequence data, and protein structure data has rapidly increased, allowing more reliable and accurate conclusions to be drawn from contemporaneous analyses.

The work presented in this thesis is an investigation of intron positions in relation to protein sequence, structure, and function. To facilitate this analysis a number of bespoke databases have been created, which include: an intron database, containing intron positions in protein coding sequences; a protein database, containing protein sequences and annotation such as secondary structure and domain boundaries; and an enzyme database, containing the position of catalytic residues in enzyme chains.

This analysis shows that intron positions are non-randomly distributed in relation to amino acids, secondary structure, and solvent accessibility. However, this can be explained principally on the basis of nucleotide bias associated with the location of introns. Most protein domains are joined by regions not associated with introns, although where an association does occur, it is unlikely to be the result of random intron insertion. There is evidence that a small set of domains may have utilised introns during evolution. In enzymes, catalytic residues have no association with introns, and most introns tend not to split pairs of catalytic residues. This is explained by a preference for catalytic residues to be close together in sequence. These analyses suggest that biased intron distributions in relation to proteins are the result of non-random intron insertion. Although introns may have subsequently been utilised in the evolution of proteins, this involvement appears to have been limited.

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: An analysis of intron positions in relation to protein structure, function and evolution
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Thesis digitised by ProQuest.
Keywords: Biological sciences; Intron positions
URI: https://discovery.ucl.ac.uk/id/eprint/10102708