Tan, C.W.;
(2006)
Using machine learning for decoy discrimination in protein tertiary structure prediction.
Doctoral thesis, University of London.
Abstract
In this thesis, the novel use of machine learning to identify low-RMSD structures for decoy discrimination in protein tertiary structure prediction is investigated. More specifically, neural networks are trained to recognize low-RMSD structures, using native protein structures as positive training examples and simulated decoy structures as negative training examples. Simulated decoy structures are derived by reversing the sequences of the native structures in the set of positive training examples and threading the reversed sequences back onto the native structures. Various features extracted from these native and simulated decoy structures are used as inputs to the neural networks: the identities of residue pairs, the separation between the residues along the sequence, the pairwise distance, and the relative solvent accessibilities of the residues. Several neural networks are created, differing in the number of input features used. The neural networks are tested against the in-house pairwise potentials of mean force method, as well as against a K-Nearest Neighbours algorithm. The second novel idea of this thesis is to use evolutionary information in the decoy discrimination process. Evolutionary information, in the form of PSI-BLAST profiles, is used as input to the neural networks. The results show that the best-performing neural network is the one whose input comprises PSI-BLAST profiles of residue pairs, pairwise distance, and the relative solvent accessibilities of the residues. This neural network is the best among all methods tested, including the pairwise potentials method, in discriminating the native structures. This thesis has therefore demonstrated the feasibility of using machine learning, more specifically neural networks, for the problem of decoy discrimination.
More significantly, evolutionary information in the form of PSI-BLAST profiles has been successfully used to further improve decoy discrimination, particularly in the discrimination of native structures.
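The decoy construction and the basic pairwise input features described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The function names, the one-coordinate-per-residue representation, the `min_separation` parameter, and the assumption that relative solvent accessibilities are supplied per position are all hypothetical choices made for illustration. PSI-BLAST profiles and the neural network itself are left out.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
# (residue i, residue j, sequence separation, distance, RSA at i, RSA at j)
PairFeature = Tuple[str, str, int, float, float, float]


def make_reversed_decoy(sequence: str) -> str:
    """Reverse the native sequence; the structure itself is left unchanged."""
    return sequence[::-1]


def pair_features(
    sequence: str,
    coords: Sequence[Coord],
    rsa: Sequence[float],
    min_separation: int = 1,  # hypothetical cutoff, not taken from the thesis
) -> List[PairFeature]:
    """Extract one feature tuple per residue pair (i < j).

    Assumes one representative coordinate and one relative solvent
    accessibility value per sequence position.
    """
    if not (len(sequence) == len(coords) == len(rsa)):
        raise ValueError("sequence, coords and rsa must have equal length")
    features: List[PairFeature] = []
    n = len(sequence)
    for i in range(n):
        for j in range(i + min_separation, n):
            features.append(
                (
                    sequence[i],
                    sequence[j],
                    j - i,
                    math.dist(coords[i], coords[j]),
                    rsa[i],
                    rsa[j],
                )
            )
    return features


def labelled_examples(
    sequence: str, coords: Sequence[Coord], rsa: Sequence[float]
) -> List[Tuple[PairFeature, int]]:
    """Label native pairs 1 and reversed-sequence (threaded) pairs 0."""
    native = [(f, 1) for f in pair_features(sequence, coords, rsa)]
    decoy_seq = make_reversed_decoy(sequence)
    # "Threading" here means placing the reversed sequence on the same
    # coordinates, so only the residue identities differ from the native.
    decoy = [(f, 0) for f in pair_features(decoy_seq, coords, rsa)]
    return native + decoy


if __name__ == "__main__":
    toy_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0)]
    toy_rsa = [0.1, 0.5, 0.9]
    for feature, label in labelled_examples("ACD", toy_coords, toy_rsa):
        print(label, feature)
```

In this sketch the native and decoy examples share the same geometry and differ only in which residue types occupy each position, which is the property a classifier would have to exploit.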
Type: | Thesis (Doctoral) |
---|---|
Title: | Using machine learning for decoy discrimination in protein tertiary structure prediction. |
Identifier: | PQ ETD:592470 |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Thesis digitised by ProQuest |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/1445155 |