Estimating County Health Indices Using Graph Neural Networks

  • Conference paper
Part of the book series: Communications in Computer and Information Science (CCIS, volume 1127)

Abstract

Population health analytics is fundamental to developing responsive public health promotion programs. The traditional way to interpret health statistics at the population level is to analyze data aggregated from individuals, typically collected through telephone surveys. Recent studies have found that social media can serve as an alternative population health surveillance system, providing quality and timely data at virtually no cost. In this paper, we further investigate the use of social media for population health estimation, using a graph neural network approach. Specifically, we first introduce a graph modeling method that represents each county as a graph of interactions between health-related features in the community. We then adopt a graph neural network model to learn the population health representation, followed by a regression layer that estimates the health indices. We validate the proposed method with large-scale experiments on Twitter data, predicting health indices of US counties. Empirical results show a significant correlation with the reported health statistics, with a Spearman correlation coefficient (\(\rho\)) of up to 0.69, and show that our graph-based approach outperforms existing methods. These promising results also suggest that graph-based models could be applied to a range of societal-level analytics tasks through social media.
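The pipeline outlined in the abstract (one graph per county, a graph neural network encoder, a regression layer, and evaluation by Spearman's \(\rho\)) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are synthetic, the adjacency matrices are random rather than derived from health-related features, the encoder is a single graph convolution with symmetric normalization in the style of Kipf and Welling, its weights are random and untrained, and only the final linear regression layer is fitted, by least squares. All function names are hypothetical.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, h, w):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    return np.maximum(a_norm @ h @ w, 0.0)


def county_embedding(adj, x, w):
    """Encode one county graph and mean-pool its nodes into a single vector."""
    return gcn_layer(normalized_adjacency(adj), x, w).mean(axis=0)


def spearman_rho(a, b):
    """Spearman correlation as the Pearson correlation of ranks (assumes no ties)."""
    rank_a = np.argsort(np.argsort(a)).astype(float)
    rank_b = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(rank_a, rank_b)[0, 1])


def demo(seed=0):
    """Run the sketch end to end on synthetic data and return Spearman's rho."""
    rng = np.random.default_rng(seed)
    n_counties, n_nodes, in_dim, hidden = 40, 6, 4, 8
    w = rng.normal(size=(in_dim, hidden))  # untrained encoder weights (assumption)

    embeddings = []
    for _ in range(n_counties):
        # Random undirected graph standing in for feature interactions.
        upper = np.triu(rng.random((n_nodes, n_nodes)) < 0.4, k=1).astype(float)
        adj = upper + upper.T
        x = rng.normal(size=(n_nodes, in_dim))  # synthetic node features
        embeddings.append(county_embedding(adj, x, w))
    z = np.stack(embeddings)

    # Synthetic "health index" per county, for illustration only.
    y = z @ rng.normal(size=hidden) + 0.1 * rng.normal(size=n_counties)

    # Regression layer: linear map with bias, fitted by least squares.
    design = np.hstack([z, np.ones((n_counties, 1))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return spearman_rho(design @ coef, y)


if __name__ == "__main__":
    print(f"Spearman rho on synthetic data: {demo():.3f}")
```

Because the correlation here is computed on the same synthetic counties used to fit the regression layer, the printed value says nothing about real predictive performance; a faithful evaluation would train the encoder end to end and score held-out counties.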

Notes

  1. https://www.cdc.gov/chronicdisease/resources/publications/aag/brfss.htm

  2. https://www.cdc.gov/brfss/

  3. https://www.cdc.gov/brfss/index.html

Author information

Corresponding author

Correspondence to Hung Nguyen.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Nguyen, H., Nguyen, D.T., Nguyen, T. (2019). Estimating County Health Indices Using Graph Neural Networks. In: Le, T., et al. Data Mining. AusDM 2019. Communications in Computer and Information Science, vol 1127. Springer, Singapore. https://doi.org/10.1007/978-981-15-1699-3_6

  • DOI: https://doi.org/10.1007/978-981-15-1699-3_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1698-6

  • Online ISBN: 978-981-15-1699-3

  • eBook Packages: Computer Science (R0)
