UCL Discovery

Reproducible disease phenotyping at scale: Example of coronary artery disease in UK Biobank

Patel, Riyaz S; Denaxas, Spiros; Howe, Laurence J; Eggo, Rosalind M; Shah, Anoop D; Allen, Naomi E; Danesh, John; ... Hemingway, Harry (2022) Reproducible disease phenotyping at scale: Example of coronary artery disease in UK Biobank. PLoS One, 17 (4), Article e0264828. 10.1371/journal.pone.0264828. Green open access


Abstract

IMPORTANCE: A lack of internationally agreed standards for combining available data sources at scale risks inconsistent disease phenotyping, which limits research reproducibility.

OBJECTIVE: To develop a rules-based algorithm and evaluate whether it can identify coronary artery disease (CAD) sub-phenotypes using electronic health records (EHR) and questionnaire data from UK Biobank (UKB).

DESIGN: Case-control and cohort study.

SETTING: Prospective cohort study of around 502,000 individuals aged 40-69 years, recruited between 2006 and 2010 into the UK Biobank, with linked hospitalization and mortality data and genotyping.

PARTICIPANTS: We included all individuals for phenotyping into 6 predefined CAD phenotypes using hospital admission and procedure codes, mortality records and baseline survey data. Of these, 408,470 unrelated individuals of European descent had a polygenic risk score (PRS) for CAD estimated.

EXPOSURE: CAD phenotypes.

MAIN OUTCOMES AND MEASURES: Association with baseline risk factors, mortality (n = 14,419 over a median follow-up of 7.8 years), and a PRS for CAD.

RESULTS: The algorithm classified individuals with CAD into prevalent MI (n = 4,900); incident MI (n = 4,621); prevalent CAD without MI (n = 10,910); incident CAD without MI (n = 8,668); prevalent self-reported MI (n = 2,754); and prevalent self-reported CAD without MI (n = 5,623), yielding 37,476 individuals with any type of CAD. Risk factors were similar across the six CAD phenotypes, except for fewer men in the self-reported CAD without MI group (46.7% vs 70.1% for the overall group). In age- and sex-adjusted survival analyses, mortality compared to disease-free controls was highest following incident MI (HR 6.66, 95% CI 6.07-7.31) and lowest for prevalent self-reported CAD without MI at baseline (HR 1.31, 95% CI 1.15-1.50). There were similar graded associations across the six phenotypes per SD increase in PRS, with the strongest association for prevalent MI (OR 1.50, 95% CI 1.46-1.55) and the weakest for prevalent self-reported CAD without MI (OR 1.08, 95% CI 1.05-1.12). The algorithm is available in the open HDR UK phenotype library (https://portal.caliberresearch.org/).

CONCLUSIONS: An algorithmic, EHR-based approach distinguished six phenotypes of CAD with distinct survival and PRS associations, supporting adoption of open approaches to help standardize CAD phenotyping and its wider potential value for reproducible research in other conditions.
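
The abstract names six mutually exclusive phenotypes but does not spell out the rules that assign them. As an illustration only, the following minimal Python sketch shows how a rules-based classifier of this shape might look. The record fields (`first_ehr_mi`, `first_ehr_cad_no_mi`, and so on), the precedence (EHR evidence over self-report, MI over CAD without MI) and the definition of "prevalent" as on or before the baseline date are all assumptions made for this example, not the published algorithm, which is available from the phenotype library linked above. Only the counts are taken from the abstract.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Phenotype counts as reported in the abstract.
REPORTED_COUNTS = {
    "prevalent MI": 4_900,
    "incident MI": 4_621,
    "prevalent CAD without MI": 10_910,
    "incident CAD without MI": 8_668,
    "prevalent self-reported MI": 2_754,
    "prevalent self-reported CAD without MI": 5_623,
}

# The six groups are mutually exclusive, so they add up to the
# "any type of CAD" total given in the abstract.
assert sum(REPORTED_COUNTS.values()) == 37_476


@dataclass
class Participant:
    """Hypothetical per-person record; field names are illustrative."""
    baseline: date                               # date of baseline assessment
    first_ehr_mi: Optional[date] = None          # first MI code in EHR data
    first_ehr_cad_no_mi: Optional[date] = None   # first non-MI CAD code in EHR data
    self_reported_mi: bool = False               # baseline survey answer
    self_reported_cad: bool = False              # baseline survey answer


def classify(p: Participant) -> Optional[str]:
    """Assign at most one phenotype using an ASSUMED precedence:
    EHR MI, then EHR CAD without MI, then self-reported MI,
    then self-reported CAD without MI."""
    if p.first_ehr_mi is not None:
        # Assumption: "prevalent" means recorded on or before baseline.
        return "prevalent MI" if p.first_ehr_mi <= p.baseline else "incident MI"
    if p.first_ehr_cad_no_mi is not None:
        if p.first_ehr_cad_no_mi <= p.baseline:
            return "prevalent CAD without MI"
        return "incident CAD without MI"
    if p.self_reported_mi:
        return "prevalent self-reported MI"
    if p.self_reported_cad:
        return "prevalent self-reported CAD without MI"
    return None  # no CAD evidence under these rules
```

Because each rule returns as soon as it matches, every participant receives at most one label, which is what allows the six reported counts to sum to the overall total.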

Type: Article
Title: Reproducible disease phenotyping at scale: Example of coronary artery disease in UK Biobank
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1371/journal.pone.0264828
Publisher version: https://doi.org/10.1371/journal.pone.0264828
Language: English
Additional information: Copyright: © 2022 Patel et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics > Clinical Epidemiology
URI: https://discovery.ucl.ac.uk/id/eprint/10146673