Zou, B; Lampos, V; Cox, I; (2018) Multi-Task Learning Improves Disease Models from Web Search. In: WWW '18 Proceedings of the 2018 World Wide Web Conference. (pp. 87-96). International World Wide Web Conferences Steering Committee
Abstract
We investigate the utility of multi-task learning to disease surveillance using Web search data. Our motivation is two-fold. Firstly, we assess whether concurrently training models for various geographies - inside a country or across different countries - can improve accuracy. We also test the ability of such models to assist health systems that are producing sporadic disease surveillance reports that reduce the quantity of available training data. We explore both linear and nonlinear models, specifically a multi-task expansion of elastic net and a multi-task Gaussian Process, and compare them to their respective single task formulations. We use influenza-like illness as a case study and conduct experiments on the United States (US) as well as England, where both health and Google search data were obtained. Our empirical results indicate that multi-task learning improves regional as well as national models for the US. The percentage of improvement on mean absolute error increases up to 14.8% as the historical training data is reduced from 5 to 1 year(s), illustrating that accurate models can be obtained, even by training on relatively short time intervals. Furthermore, in simulated scenarios, where only a few health reports (training data) are available, we show that multi-task learning helps to maintain a stable performance across all the affected locations. Finally, we present results from a cross-country experiment, where data from the US improves the estimates for England. As the historical training data for England is reduced, the benefits of multi-task learning increase, reducing mean absolute error by up to 40%.
Type: | Proceedings paper |
---|---|
Title: | Multi-Task Learning Improves Disease Models from Web Search |
Event: | The Web Conference |
Location: | Lyon, France |
Dates: | 23 April 2018 - 27 April 2018 |
ISBN-13: | 978-1-4503-5639-8 |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1145/3178876.3186050 |
Publisher version: | https://doi.org/10.1145/3178876.3186050 |
Language: | English |
Additional information: | © 2018 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC BY 4.0 License. This paper is published under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Authors reserve their rights to disseminate the work on their personal and corporate Web sites with the appropriate attribution. |
Keywords: | Disease surveillance, Gaussian Processes, multi-task learning, regularized regression, user-generated content, Web search |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10047721 |



