
Probabilistic integration of large Brazilian socioeconomic and clinical databases

Pinto, C; Pita, R; Barbosa, G; Araujo, B; Bertoldo, J; Sena, S; Reis, S; ... Denaxas, S (2017) Probabilistic integration of large Brazilian socioeconomic and clinical databases. In: Bamidis, PD and Konstantinidis, ST and Rodrigues, PP, (eds.) Proceedings of the 30th IEEE International Symposium on Computer-Based Medical Systems (CBMS). (pp. 515-520). IEEE: New York, USA. Green open access

CBMS2017_paper_204.pdf - Accepted Version


Abstract

The integration of disparate large and heterogeneous socioeconomic and clinical databases is considered essential to capture and model longitudinal and social aspects of diseases. However, such integration is challenging: databases are stored in disparate locations, make use of different identifiers, have variable data quality, record information in bespoke purpose-specific formats and have different levels of metadata. Novel computational methods are required to integrate them and enable their statistical analyses for epidemiological research purposes. In this paper, we describe a probabilistic approach for constructing a very large population-based cohort comprising 114 million individuals using linkages between clinical databases from the National Health System and administrative databases from governmental social programmes. We present our data integration model for creating data marts (epidemiological data) and discuss our evaluation results in controlled and uncontrolled scenarios, which demonstrate that our model and tools achieve high accuracy (minimum of 91%) in different probabilistic data integration scenarios.

Type: Proceedings paper
Title: Probabilistic integration of large Brazilian socioeconomic and clinical databases
Event: 30th IEEE International Symposium on Computer-Based Medical Systems (CBMS), 22-24 June 2017, Thessaloniki, Greece
Location: Aristotle University of Thessaloniki, Thessaloniki, Greece
Dates: 22 June 2017 - 24 June 2017
ISBN-13: 978-1-5386-1711-3
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/CBMS.2017.64
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Data integration; Probabilistic linkage; Health and social care data; Accuracy assessment
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics > Clinical Epidemiology
URI: https://discovery.ucl.ac.uk/id/eprint/10067325
