UCL Discovery

Using machine learning for decoy discrimination in protein tertiary structure prediction.

Tan, C.W. (2006) Using machine learning for decoy discrimination in protein tertiary structure prediction. Doctoral thesis, University of London. Green open access

Abstract

This thesis investigates the novel use of machine learning to identify low-RMSD structures during decoy discrimination in protein tertiary structure prediction. More specifically, neural networks are trained to recognize low-RMSD structures, using native protein structures as positive training examples and simulated decoy structures as negative training examples. Simulated decoy structures are derived by reversing the sequences of the native structures in the positive training set and threading the reversed sequences back onto the native structures. Various features extracted from these native and simulated decoy structures are used as inputs to the neural networks: the identities of residue pairs, the separation between the residues along the sequence, the pairwise distance, and the relative solvent accessibilities of the residues. Several neural networks are created, differing in the number of input features used. The neural networks are tested against the in-house pairwise potentials of mean force method, as well as against a K-Nearest Neighbours algorithm. The second novel idea of this thesis is to use evolutionary information in the decoy discrimination process. Evolutionary information, in the form of PSI-BLAST profiles, is used as input to the neural networks. The results show that the best-performing neural network is the one whose input comprises PSI-BLAST profiles of residue pairs, pairwise distance, and the relative solvent accessibilities of the residues. This neural network is the best among all methods tested, including the pairwise potentials method, at discriminating the native structures. This thesis therefore demonstrates the feasibility of using machine learning, more specifically neural networks, for the problem of decoy discrimination. More significantly, evolutionary information in the form of PSI-BLAST profiles has been successfully used to further improve decoy discrimination, particularly in the discrimination of native structures.
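
The decoy-generation step described in the abstract (reverse the native sequence, then thread it back onto the unchanged native structure) can be illustrated with a minimal sketch. This is not the thesis's implementation. The function names `reverse_thread` and `pair_features`, the use of C-alpha coordinates to represent a structure, and the three-residue toy coordinates are all assumptions made for illustration. Relative solvent accessibility and PSI-BLAST profiles are left out because they require external tools.

```python
import math
from itertools import combinations


def reverse_thread(sequence, ca_coords):
    """Build a simulated decoy (hypothetical helper): the reversed sequence
    is placed on the unchanged native coordinates."""
    if len(sequence) != len(ca_coords):
        raise ValueError("sequence and coordinates must have the same length")
    return sequence[::-1], list(ca_coords)


def pair_features(sequence, ca_coords):
    """Return one (residue_i, residue_j, sequence_separation, distance)
    tuple per residue pair. Using C-alpha distances is an assumption."""
    features = []
    for i, j in combinations(range(len(sequence)), 2):
        distance = math.dist(ca_coords[i], ca_coords[j])
        features.append((sequence[i], sequence[j], j - i, distance))
    return features


# Toy three-residue "native" structure; the coordinates are invented.
native_seq = "MKV"
native_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]

decoy_seq, decoy_ca = reverse_thread(native_seq, native_ca)

positives = pair_features(native_seq, native_ca)  # native: positive examples
negatives = pair_features(decoy_seq, decoy_ca)    # decoy: negative examples
```

In this sketch the geometry of the positive and negative examples is identical; only the residue identities assigned to each position differ, which is what a classifier trained on such pairs would have to pick up on.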

Type: Thesis (Doctoral)
Title: Using machine learning for decoy discrimination in protein tertiary structure prediction.
Identifier: PQ ETD:592470
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Thesis digitised by ProQuest
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/1445155
Downloads since deposit: 121
