UCL Discovery

Crohn Disease Risk Prediction-Best Practices and Pitfalls with Exome Data

Giollo, M; Jones, DT; Carraro, M; Leonardi, E; Ferrari, C; Tosatto, SC; (2017) Crohn Disease Risk Prediction-Best Practices and Pitfalls with Exome Data. Human Mutation, 38 (9) pp. 1193-1200. 10.1002/humu.23177. Green open access

humu23177.pdf - Accepted Version

Abstract

The Critical Assessment of Genome Interpretation (CAGI) experiment is the first attempt to evaluate the state of the art in genetic data interpretation. Among the proposed challenges, Crohn disease (CD) risk prediction has become the most classic problem, spanning three editions. The scientific question is very hard: can anybody assess the risk of developing CD given the exome data alone? This is one of the ultimate goals of genetic analysis, which motivated most CAGI participants to look for powerful new methods. In the 2016 CD challenge we implemented all the best methods proposed in the past editions. This resulted in 10 algorithms, which were evaluated fairly by the CAGI organizers. We also used all the data available from CAGI 11 and 13 to maximize the number of training samples. The most effective algorithms used known genes associated with CD from the literature. No method could effectively evaluate the importance of unannotated variants by using heuristics. As a downside, all CD datasets were strongly affected by sample stratification, which in turn affected the performance reported by the assessors. Therefore, we expect that future datasets will be normalized in order to remove population effects. This will improve method comparison and promote algorithms focused on causal variant discovery.

Type: Article
Title: Crohn Disease Risk Prediction-Best Practices and Pitfalls with Exome Data
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1002/humu.23177
Publisher version: http://dx.doi.org/10.1002/humu.23177
Language: English
Additional information: This is the peer reviewed version of the following article: Giollo, M., Jones, D. T., Carraro, M., Leonardi, E., Ferrari, C. and Tosatto, S. C.E. (2017), Crohn Disease Risk Prediction–Best Practices and Pitfalls with Exome Data. Human Mutation. Accepted Author Manuscript. doi:10.1002/humu.23177, which has been published in final form at http://dx.doi.org/10.1002/humu.23177. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.
Keywords: Crohn Disease, Disease Risk Prediction, Exome Data, Genetic Analysis, Linear Models, Machine Learning, Methods Comparison, Next-Generation Sequencing, SNV evaluation, Variants prioritization
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/1537232