
Notifiable infectious disease surveillance with data collected by search engine

Journal of Zhejiang University SCIENCE C

Abstract

Notifiable infectious diseases are a major public health concern in China, causing about five million illnesses and twelve thousand deaths every year. Early detection of disease activity, when followed by a rapid response, can reduce both the social and the medical impact of a disease. We aim to improve early detection by monitoring health-seeking behavior and disease-related news on the Internet. Specifically, we counted the unique search queries submitted to the Baidu search engine in 2008 that contained disease-related search terms. We also counted the news articles aggregated by Baidu’s robot programs that contained disease-related keywords. We found that both the search frequency data and the news count data have a distinct temporal association with disease activity. We adopted a linear model that uses searches and news with a 1–200-day lead time as explanatory variables to predict the number of infections and deaths attributable to four notifiable infectious diseases: scarlet fever, dysentery, AIDS, and tuberculosis. Using the search frequency and news count data, our approach can quantitatively estimate up-to-date epidemic trends 10–40 days ahead of the release of Chinese Centers for Disease Control and Prevention (Chinese CDC) reports. This approach may provide an additional tool for notifiable infectious disease surveillance.
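The lagged linear model described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the model's exact form, so the single fixed lag, the function names (`fit_lagged_linear`, `predict_cases`), the ordinary least squares fit, and the synthetic data below are all assumptions made for illustration. The sketch regresses case counts at day t on search and news counts at day t minus the lag.

```python
import numpy as np

def fit_lagged_linear(cases, searches, news, lag):
    """Fit cases[t] ~ b0 + b1 * searches[t - lag] + b2 * news[t - lag].

    Hypothetical helper: ordinary least squares with one fixed lead time.
    Requires 0 <= lag < len(cases).
    """
    cases = np.asarray(cases, dtype=float)
    searches = np.asarray(searches, dtype=float)
    news = np.asarray(news, dtype=float)
    n = len(cases)
    # Explanatory variables observed `lag` days before each response value.
    X = np.column_stack([np.ones(n - lag), searches[: n - lag], news[: n - lag]])
    y = cases[lag:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_cases(coef, search_count, news_count):
    """Estimate the case count `lag` days after the given search and news counts."""
    return coef[0] + coef[1] * search_count + coef[2] * news_count

# Synthetic data (assumed, not from the paper): cases follow searches and
# news from 14 days earlier with known coefficients 5.0, 2.0, and 0.5.
rng = np.random.default_rng(0)
n_days, lag = 120, 14
searches = rng.uniform(100, 200, n_days)
news = rng.uniform(0, 50, n_days)
cases = np.zeros(n_days)
cases[lag:] = 5.0 + 2.0 * searches[:-lag] + 0.5 * news[:-lag]

coef = fit_lagged_linear(cases, searches, news, lag)
estimate = predict_cases(coef, searches[-1], news[-1])
print(coef, estimate)
```

Because the synthetic cases are an exact linear function of the lagged inputs, the fit should recover the generating coefficients. With real surveillance data, the lead time would have to be selected per disease, for example by comparing fits across the 1–200-day range mentioned in the abstract.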



Author information


Corresponding author

Correspondence to Hai-bin Shen.


About this article

Cite this article

Zhou, Xc., Shen, Hb. Notifiable infectious disease surveillance with data collected by search engine. J. Zhejiang Univ. - Sci. C 11, 241–248 (2010). https://doi.org/10.1631/jzus.C0910371


  • DOI: https://doi.org/10.1631/jzus.C0910371
