
Data extraction methods for systematic review (semi)automation: A living systematic review [version 1; peer review: awaiting peer review]

Schmidt, L; Olorisade, BK; McGuinness, LA; Thomas, J; Higgins, JPT (2021) Data extraction methods for systematic review (semi)automation: A living systematic review [version 1; peer review: awaiting peer review]. F1000Research, 10, p. 401. 10.12688/f1000research.51117.1. Green open access

e26e3d11-1f92-4c2e-9edb-93174ceebc19_51117_-_lena_schmidt.pdf - Published Version

Abstract

Background: The reliable and usable (semi)automation of data extraction can support the field of systematic review by reducing the workload required to gather information about the conduct and results of the included studies. This living systematic review examines published approaches for data extraction from reports of clinical studies.

Methods: We systematically and continually search MEDLINE, Institute of Electrical and Electronics Engineers (IEEE), arXiv, and the dblp computer science bibliography databases. Full text screening and data extraction are conducted within an open-source living systematic review application created for the purpose of this review. This iteration of the living review includes publications up to a cut-off date of 22 April 2020.

Results: In total, 53 publications are included in this version of our review. Of these, 41 (77%) of the publications addressed extraction of data from abstracts, while 14 (26%) used full texts. A total of 48 (90%) publications developed and evaluated classifiers that used randomised controlled trials as the main target texts. Over 30 entities were extracted, with PICOs (population, intervention, comparator, outcome) being the most frequently extracted. A description of their datasets was provided by 49 publications (94%), but only seven (13%) made the data publicly available. Code was made available by 10 (19%) publications, and five (9%) implemented publicly available tools.

Conclusions: This living systematic review presents an overview of (semi)automated data-extraction literature of interest to different types of systematic review. We identified a broad evidence base of publications describing data extraction for interventional reviews and a small number of publications extracting epidemiological or diagnostic accuracy data. The lack of publicly available gold-standard data for evaluation, and lack of application thereof, makes it difficult to draw conclusions on which is the best-performing system for each data extraction target. With this living review we aim to review the literature continually.

Type: Article
Title: Data extraction methods for systematic review (semi)automation: A living systematic review [version 1; peer review: awaiting peer review]
Open access status: An open access version is available from UCL Discovery
DOI: 10.12688/f1000research.51117.1
Publisher version: http://doi.org/10.12688/f1000research.51117.1
Language: English
Additional information: This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
Keywords: Data Extraction, Natural Language Processing, Reproducibility, Systematic Reviews, Text Mining
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Education
UCL > Provost and Vice Provost Offices > School of Education > UCL Institute of Education
UCL > Provost and Vice Provost Offices > School of Education > UCL Institute of Education > IOE - Social Research Institute
URI: https://discovery.ucl.ac.uk/id/eprint/10128803