A Study of Machine Learning Models on Epidemic Surveillance: Using Query Logs of Search Engines
Date Issued
2010
Author(s)
Fang, Ze-Han
Abstract
Epidemics can cause large numbers of deaths and considerable social and economic damage, so epidemic surveillance has become an important healthcare research issue. In 2009, Ginsberg et al. observed that search engine query logs can be used to estimate the status of an epidemic in a timely manner. In this thesis, we model epidemic surveillance as a classification problem and use query statistics from Google to classify the status of a dengue fever epidemic. The query logs of twenty-three dengue-related keywords serve as the observations for training and testing. Evaluations on a five-year real-world dataset show that search engine query logs can be used to construct accurate epidemic status classifiers, and that the learned classifiers generally outperform conventional regression approaches. To examine how well different kinds of model suit epidemic surveillance, we compare generative, discriminative, sequential, and non-sequential classification models.
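The classification framing described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a Gaussian naive Bayes classifier (one example of a generative, non-sequential model), two status labels ("epidemic" and "non-epidemic"), and three unnamed keywords instead of the twenty-three used in the study. The weekly query volumes are synthetic values made up for illustration, and the names `fit_gaussian_nb` and `predict` are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-3):
    """Estimate a log prior and per-feature Gaussian parameters for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Floor the variance so a constant feature cannot cause division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, var_floor)
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one week of counts."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * s2) - (x - m) ** 2 / (2 * s2)
            for x, m, s2 in zip(row, means, variances)
        )
        return log_prior + log_likelihood

    return max(model, key=log_posterior)


# Synthetic weekly query volumes for three hypothetical keywords (illustration only).
X_train = [
    [5, 3, 1], [6, 2, 2], [4, 4, 1], [7, 3, 2],              # quiet weeks
    [40, 25, 12], [38, 30, 15], [45, 28, 10], [42, 27, 14],  # outbreak weeks
]
y_train = ["non-epidemic"] * 4 + ["epidemic"] * 4

model = fit_gaussian_nb(X_train, y_train)
print(predict(model, [41, 26, 13]))
print(predict(model, [5, 3, 2]))
```

A sequential model would additionally condition each week's status on the preceding weeks, which this per-week sketch does not attempt.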
Subjects
Text Mining
Classification
Query Log Analysis
Type
thesis
File(s)
Name
ntu-99-R97725031-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):e68088163cd6290888f6b1d89e7fdba5
