UCL Discovery
UCL home » Library Services » Electronic resources » UCL Discovery

CATH functional families predict functional sites in proteins

Das, S; Scholes, HM; Sen, N; Orengo, C; (2020) CATH functional families predict functional sites in proteins. Bioinformatics 10.1093/bioinformatics/btaa937. (In press). Green open access

[thumbnail of btaa937.pdf]
Preview
Text
btaa937.pdf - Accepted Version

Download (663kB) | Preview

Abstract

MOTIVATION: Identification of functional sites in proteins is essential for functional characterization, variant interpretation and drug design. Several methods are available for predicting either a generic functional site, or specific types of functional site. Here, we present FunSite, a machine learning predictor that identifies catalytic, ligand-binding and protein-protein interaction functional sites using features derived from protein sequence and structure, and evolutionary data from CATH functional families (FunFams). RESULTS: FunSite's prediction performance was rigorously benchmarked using cross-validation and a holdout dataset. FunSite outperformed other publicly-available functional site prediction methods. We show that conserved residues in FunFams are enriched in functional sites. We found FunSite's performance depends greatly on the quality of functional site annotations and the information content of FunFams in the training data. Finally, we analyse which structural and evolutionary features are most predictive for functional sites. AVAILABILITY: https://github.com/UCL/cath-funsite-predictor. CONTACT: c.orengo@ucl.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Type: Article
Title: CATH functional families predict functional sites in proteins
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.1093/bioinformatics/btaa937
Publisher version: https://doi.org/10.1093/bioinformatics/btaa937
Language: English
Additional information: Copyright © The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Structural and Molecular Biology
URI: https://discovery.ucl.ac.uk/id/eprint/10114671
Downloads since deposit
192Downloads
Download activity - last month
Download activity - last 12 months
Downloads by country - last 12 months

Archive Staff Only

View Item View Item