eprintid: 1469803
rev_number: 39
eprint_status: archive
userid: 608
dir: disk0/01/46/98/03
datestamp: 2015-10-09 15:01:00
lastmod: 2021-09-26 22:49:14
status_changed: 2015-10-09 15:01:00
type: article
metadata_visibility: show
item_issues_count: 0
creators_name: Lampos, V
creators_name: Yom-Tov, E
creators_name: Pebody, R
creators_name: Cox, IJ
title: Assessing the impact of a health intervention via user-generated Internet content
ispublished: pub
divisions: UCL
divisions: B04
divisions: C05
divisions: F48
keywords: Gaussian Process, Infectious diseases, Intervention, Search query logs, Social media, Supervised learning, User-generated content
note: © The Author(s) 2015. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
abstract: Assessing the effect of a health-oriented intervention by traditional epidemiological methods is commonly based only on population segments that use healthcare services. Here we introduce a complementary framework for evaluating the impact of a targeted intervention, such as a vaccination campaign against an infectious disease, through a statistical analysis of user-generated content submitted on web platforms. Using supervised learning, we derive a nonlinear regression model for estimating the prevalence of a health event in a population from Internet data. This model is applied to identify control location groups that correlate historically with the areas where a specific intervention campaign has taken place. We then determine the impact of the intervention by inferring a projection of the disease rates that could have emerged in the absence of a campaign. Our case study focuses on the influenza vaccination program that was launched in England during the 2013/14 season, and our observations consist of millions of geo-located search queries to the Bing search engine and posts on Twitter. The impact estimates derived from the application of the proposed statistical framework support conventional assessments of the campaign.
date: 2015-07-02
publisher: Kluwer Academic Publishers
official_url: http://dx.doi.org/10.1007/s10618-015-0427-9
vfaculties: VENG
oa_status: green
full_text_type: pub
primo: open
primo_central: open_green
article_type_text: Article in Press
verified: verified_manual
elements_source: Manually entered
elements_id: 1041180
doi: 10.1007/s10618-015-0427-9
lyricists_name: Cox, Ingemar
lyricists_name: Lampos, Vasileios
lyricists_id: IJCOX77
lyricists_id: VLAMP72
full_text_status: public
publication: Data Mining and Knowledge Discovery
volume: 29
number: 5
pagerange: 1434-1457
issn: 1384-5810
citation: Lampos, V; Yom-Tov, E; Pebody, R; Cox, IJ; (2015) Assessing the impact of a health intervention via user-generated Internet content. Data Mining and Knowledge Discovery, 29 (5) pp. 1434-1457. 10.1007/s10618-015-0427-9 <https://doi.org/10.1007/s10618-015-0427-9>. Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/1469803/13/Assessing%20the%20impact%20of%20a%20health%20intervention%20via%20user-generated%20Internet%20content.pdf