Papez, Vaclav;
Moinat, Maxim;
Voss, Erica A;
Bazakou, Sofia;
Van Winzum, Anne;
Peviani, Alessia;
Payralbe, Stefan;
... Denaxas, Spiros;
(2023) Transforming and evaluating the UK Biobank to the OMOP Common Data Model for COVID-19 research and beyond. Journal of the American Medical Informatics Association, 30(1), pp. 103-111. 10.1093/jamia/ocac203.
Abstract
OBJECTIVE: The COVID-19 pandemic has demonstrated the value of real-world data for public health research. International federated analyses are crucial for informing policy makers. Common data models (CDM) are critical for enabling these studies to be performed efficiently. Our objective was to convert the UK Biobank, a study of 500,000 participants with rich genetic and phenotypic data, to the Observational Medical Outcomes Partnership (OMOP) CDM. MATERIALS AND METHODS: We converted UK Biobank data to OMOP CDM v. 5.3. We transformed participant research data on diseases collected at recruitment and electronic health records (EHR) from primary care, hospitalizations, cancer registrations, and mortality from providers in England, Scotland, and Wales. We performed syntactic and semantic validations and compared comorbidities and risk factors between source and transformed data. RESULTS: We identified 502,505 participants (3,086 with COVID-19) and transformed 690 fields (1,373,239,555 rows) to the OMOP CDM using eight different controlled clinical terminologies and bespoke mappings. Specifically, we transformed self-reported non-cancer illnesses 946,053 (83.91% of all source entries), cancers 37,802 (70.81%), medications 1,218,935 (88.25%), and prescriptions 864,788 (86.96%). In EHR, we transformed 13,028,182 (99.95%) hospital diagnoses, 6,465,399 (89.2%) procedures, 337,896,333 primary care diagnoses (CTV3, SNOMED-CT), 139,966,587 (98.74%) prescriptions (dm+d) and 77,127 (99.95%) deaths (ICD-10). We observed good concordance across demographic, risk factor, and comorbidity factors between source and transformed data. DISCUSSION AND CONCLUSION: Our study demonstrated that the OMOP CDM can be successfully leveraged to harmonize complex large-scale biobanked studies combining rich multimodal phenotypic data. Our study uncovered several challenges when transforming data from questionnaires to the OMOP CDM which require further research.
The transformed UK Biobank resource is a valuable tool that can enable federated research, like COVID-19 studies.
Type: | Article |
---|---|
Title: | Transforming and evaluating the UK Biobank to the OMOP Common Data Model for COVID-19 research and beyond |
Location: | England |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1093/jamia/ocac203 |
Publisher version: | https://doi.org/10.1093/jamia/ocac203 |
Language: | English |
Additional information: | Correction issued 27/2/23 (https://doi.org/10.1093/jamia/ocad032). - Copyright © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
Keywords: | Common data model, electronic health records, medical ontologies, OMOP, phenotyping |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics > Clinical Epidemiology |
URI: | https://discovery.ucl.ac.uk/id/eprint/10159833 |