Identifying Health-Related Topics on Twitter

An Exploration of Tobacco-Related Tweets as a Test Topic

  • Conference paper
Social Computing, Behavioral-Cultural Modeling and Prediction (SBP 2011)

Abstract

Public health-related topics are difficult to identify in large conversational datasets such as Twitter. This study examines how to model and discover public health topics and themes in tweets. Tobacco use is chosen as a test case to demonstrate the effectiveness of topic modeling via latent Dirichlet allocation (LDA) across a large, representative dataset from the United States, as well as across a smaller subset seeded by tobacco-related queries. Topic modeling across the large dataset uncovers several public health-related topics, although tobacco is not detected by this method. Topic modeling across the tobacco subset, however, provides valuable insight into tobacco use in the United States. The methods used in this paper offer a possible toolset for public health researchers and practitioners to better understand public health problems through large datasets of conversational data.
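
The two steps the abstract describes, building a query-seeded subset and running LDA over it, can be illustrated with a minimal sketch. This is not the authors' pipeline: the seed terms, the toy tweets, the tokenizer, the hyperparameters, and every function name below are assumptions made for illustration, and the sampler is a plain collapsed Gibbs sampler for LDA written against the Python standard library only.

```python
import random
from collections import Counter

# Hypothetical seed terms; the paper's actual query list is not reproduced here.
SEED_TERMS = {"smoking", "tobacco", "cigarette", "hookah"}


def tokenize(text):
    """Lowercase, split on whitespace, strip edge punctuation, keep alphabetic tokens."""
    tokens = (w.strip(".,!?#@:;\"'()") for w in text.lower().split())
    return [t for t in tokens if t.isalpha()]


def seeded_subset(tweets, seed_terms=SEED_TERMS):
    """Keep only tweets that contain at least one seed term."""
    return [t for t in tweets if seed_terms & set(tokenize(t))]


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    Returns (vocab, topic_word, doc_topic), where topic_word[k] is a Counter
    of word counts for topic k and doc_topic[d][k] counts the tokens of
    document d currently assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample from the (unnormalized) conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return vocab, topic_word, doc_topic


def top_words(topic_word, n=5):
    """Return up to n highest-count words per topic, skipping zero counts."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]


if __name__ == "__main__":
    # Toy tweets invented for this sketch; not data from the paper.
    tweets = [
        "Trying to quit smoking this week",
        "Flu shot at the clinic today",
        "Hookah bar with friends tonight",
        "Great game last night",
    ]
    subset = seeded_subset(tweets)
    docs = [tokenize(t) for t in subset]
    _, topic_word, _ = lda_gibbs(docs, num_topics=2, iters=50)
    for k, words in enumerate(top_words(topic_word)):
        print(f"topic {k}: {words}")
```

On a corpus this small the resulting topics carry no meaning; the sketch only shows the mechanics. A realistic run would also remove stop words and would likely use an established LDA implementation rather than a hand-written sampler.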

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Prier, K.W., Smith, M.S., Giraud-Carrier, C., Hanson, C.L. (2011). Identifying Health-Related Topics on Twitter. In: Salerno, J., Yang, S.J., Nau, D., Chai, SK. (eds) Social Computing, Behavioral-Cultural Modeling and Prediction. SBP 2011. Lecture Notes in Computer Science, vol 6589. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19656-0_4

  • DOI: https://doi.org/10.1007/978-3-642-19656-0_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19655-3

  • Online ISBN: 978-3-642-19656-0

  • eBook Packages: Computer Science, Computer Science (R0)
