%0 Journal Article
%A Canas, LS
%A Sudre, CH
%A Pujol, JC
%A Polidori, L
%A Murray, B
%A Molteni, E
%A Graham, MS
%A Klaser, K
%A Antonelli, M
%A Berry, S
%A Davies, R
%A Nguyen, LH
%A Drew, DA
%A Wolf, J
%A Chan, AT
%A Spector, T
%A Steves, CJ
%A Ourselin, S
%A Modat, M
%D 2021
%F discovery:10136309
%I ELSEVIER
%J Lancet Digital Health
%K Science & Technology, Life Sciences & Biomedicine, Medical Informatics, Medicine, General & Internal, General & Internal Medicine, INDIVIDUALS
%N 9
%P e587-e598
%T Early detection of COVID-19 in the UK using self-reported symptoms: a large-scale, prospective, epidemiological surveillance study
%U https://discovery.ucl.ac.uk/id/eprint/10136309/
%V 3
%X Background Self-reported symptoms during the COVID-19 pandemic have been used to train artificial intelligence models to identify possible infection foci. To date, these models have only considered the culmination or peak of symptoms, which is not suitable for the early detection of infection. We aimed to estimate the probability of an individual being infected with SARS-CoV-2 on the basis of early self-reported symptoms to enable timely self-isolation and urgent testing. Methods In this large-scale, prospective, epidemiological surveillance study, we used prospective, observational, longitudinal, self-reported data from participants in the UK on 19 symptoms over 3 days after symptoms onset and COVID-19 PCR test results extracted from the COVID-19 Symptom Study mobile phone app. We divided the study population into a training set (those who reported symptoms between April 29, 2020, and Oct 15, 2020) and a test set (those who reported symptoms between Oct 16, 2020, and Nov 30, 2020), and used three models to analyse the self-reported symptoms: the UK’s National Health Service (NHS) algorithm, logistic regression, and the hierarchical Gaussian process model we designed to account for several important variables (eg, specific COVID-19 symptoms, comorbidities, and clinical information). Model performance to predict COVID-19 positivity was compared in terms of sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) in the test set. For the hierarchical Gaussian process model, we also evaluated the relevance of symptoms in the early detection of COVID-19 in population subgroups stratified according to occupation, sex, age, and body-mass index. Findings The training set comprised 182 991 participants and the test set comprised 15 049 participants. When trained on 3 days of self-reported symptoms, the hierarchical Gaussian process model had a higher prediction AUC (0·80 [95% CI 0·80–0·81]) than did the logistic regression model (0·74 [0·74–0·75]) and the NHS algorithm (0·67 [0·67–0·67]). AUCs for all models increased with the number of days of self-reported symptoms, but were still high for the hierarchical Gaussian process model at day 1 (0·73 [95% CI 0·73–0·74]) and day 2 (0·79 [0·78–0·79]). At day 3, the hierarchical Gaussian process model also had a significantly higher sensitivity, but a non-statistically lower specificity, than did the two other models. The hierarchical Gaussian process model also identified different sets of relevant features to detect COVID-19 between younger and older subgroups, and between health-care workers and non-health-care workers. When used during different pandemic periods, the model was robust to changes in populations. Interpretation Early detection of SARS-CoV-2 infection is feasible with our model. Such early detection is crucial to contain the spread of COVID-19 and efficiently allocate medical resources. Funding ZOE, the UK Government Department of Health and Social Care, the Wellcome Trust, the UK Engineering and Physical Sciences Research Council, the UK National Institute for Health Research, the UK Medical Research Council, the British Heart Foundation, the Alzheimer’s Society, the Chronic Disease Research Foundation, and the Massachusetts Consortium on Pathogen Readiness.
%Z This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/