UCL Discovery

Transforming and evaluating the UK Biobank to the OMOP Common Data Model for COVID-19 research and beyond

Papez, Vaclav; Moinat, Maxim; Voss, Erica A; Bazakou, Sofia; Van Winzum, Anne; Peviani, Alessia; Payralbe, Stefan; ... Denaxas, Spiros (2023) Transforming and evaluating the UK Biobank to the OMOP Common Data Model for COVID-19 research and beyond. Journal of the American Medical Informatics Association, 30 (1) pp. 103-111. 10.1093/jamia/ocac203. Green open access

Text: Papez_ocac203.pdf (636kB)

Abstract

OBJECTIVE: The COVID-19 pandemic has demonstrated the value of real-world data for public health research. International federated analyses are crucial for informing policy makers. Common data models (CDMs) are critical for enabling these studies to be performed efficiently. Our objective was to convert the UK Biobank, a study of 500,000 participants with rich genetic and phenotypic data, to the Observational Medical Outcomes Partnership (OMOP) CDM.

MATERIALS AND METHODS: We converted UK Biobank data to OMOP CDM v. 5.3. We transformed participant research data on diseases collected at recruitment and electronic health records (EHR) from primary care, hospitalizations, cancer registrations, and mortality from providers in England, Scotland, and Wales. We performed syntactic and semantic validations and compared comorbidities and risk factors between source and transformed data.

RESULTS: We identified 502,505 participants (3,086 with COVID-19) and transformed 690 fields (1,373,239,555 rows) to the OMOP CDM using eight different controlled clinical terminologies and bespoke mappings. Specifically, we transformed 946,053 self-reported non-cancer illnesses (83.91% of all source entries), 37,802 cancers (70.81%), 1,218,935 medications (88.25%), and 864,788 prescriptions (86.96%). In the EHR data, we transformed 13,028,182 (99.95%) hospital diagnoses, 6,465,399 (89.2%) procedures, 337,896,333 primary care diagnoses (CTV3, SNOMED-CT), 139,966,587 (98.74%) prescriptions (dm+d), and 77,127 (99.95%) deaths (ICD-10). We observed good concordance across demographic, risk factor, and comorbidity factors between source and transformed data.

DISCUSSION AND CONCLUSION: Our study demonstrated that the OMOP CDM can be successfully leveraged to harmonize complex large-scale biobanked studies combining rich multimodal phenotypic data. Our study uncovered several challenges when transforming data from questionnaires to the OMOP CDM, which require further research. The transformed UK Biobank resource is a valuable tool that can enable federated research, such as COVID-19 studies.

Type: Article
Title: Transforming and evaluating the UK Biobank to the OMOP Common Data Model for COVID-19 research and beyond
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.1093/jamia/ocac203
Publisher version: https://doi.org/10.1093/jamia/ocac203
Language: English
Additional information: Correction issued 27/2/23 (https://doi.org/10.1093/jamia/ocad032). Copyright © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Keywords: Common data model, electronic health records, medical ontologies, omop, phenotyping
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics > Clinical Epidemiology
URI: https://discovery.ucl.ac.uk/id/eprint/10159833