Article DOI: 10.1145/3277104.3277105
research-article

Less is More: With a 280-character limit, Twitter Provides a Valuable Source for Detecting Self-reported Flu Cases

Published: 08 September 2018

Abstract

Social media users post massive amounts of data of many types, including text messages, photos, and links. They share their personal opinions, feelings, and even their health status. The high volume of health-related tweets can be used to track the activity of infectious diseases. In this paper, we describe our work on processing Twitter data to detect self-reported cases of the flu using supervised machine learning methods. Results on a large set of English-language tweets posted during the winter season indicate that machine learning classifiers are effective at detecting possible self-reported flu cases.
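
To make the idea of supervised detection concrete, the following is a minimal sketch rather than the authors' pipeline. The abstract does not name the classifiers or features used, so the example assumes a bag-of-words multinomial naive Bayes model with add-one smoothing as a stand-in. The tweets, the labels, and the names `TRAIN`, `tokenize`, `train`, and `predict` are all hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: invented tweets, NOT taken from the paper's corpus.
# Label 1 = self-reported flu case, 0 = other flu-related chatter.
TRAIN = [
    ("i have the flu and feel terrible", 1),
    ("stuck in bed with the flu again", 1),
    ("i think i caught the flu my fever is high", 1),
    ("got my flu shot at the pharmacy today", 0),
    ("flu season news cdc reports more cases", 0),
    ("article on how the flu vaccine works", 0),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per class and documents per class (naive Bayes statistics)."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the higher smoothed log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        total_words = sum(word_counts[label].values())
        # Start from the log prior of the class.
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(predict(model, "i have a fever and the flu"))  # 1 on this toy data
print(predict(model, "cdc news on flu vaccine"))     # 0 on this toy data
```

On this toy data, first-person wording ("i", "have", "fever") pushes a tweet toward the self-report class, while news and vaccine vocabulary pushes it away. A real system would need a far larger labeled corpus and a proper evaluation before any such pattern could be trusted.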

    Published In

    ICCBD '18: Proceedings of the 2018 International Conference on Computing and Big Data
    September 2018
    103 pages
    ISBN:9781450365406
    DOI:10.1145/3277104
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Big Data
    2. Classification
    3. Flu
    4. Infectious Disease
    5. Influenza
    6. Internet Data
    7. Machine Learning
    8. Twitter

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCBD '18

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 12
    • Downloads (Last 6 weeks): 2
    Reflects downloads up to 07 Mar 2025

    Cited By

    • (2024) Enhancing Hajj and Umrah Rituals and Crowd Management Through AI Technologies: A Comprehensive Survey of Applications and Future Directions. IEEE Access 12 (161820-161841). DOI: 10.1109/ACCESS.2024.3487923. Online publication date: 2024
    • (2024) Deep learning approach to detect cyberbullying on twitter. Multimedia Tools and Applications. DOI: 10.1007/s11042-024-19869-3. Online publication date: 23-Jul-2024
    • (2023) Investigating and Analyzing Self-Reporting of Long COVID on Twitter: Findings from Sentiment Analysis. Applied System Innovation 6:5 (92). DOI: 10.3390/asi6050092. Online publication date: 12-Oct-2023
    • (2023) Uncovering the Complexity of Perinatal Polysubstance Use Patterns on X: A Mixed Methods Approach (Preprint). Journal of Medical Internet Research. DOI: 10.2196/53171. Online publication date: 28-Sep-2023
    • (2023) Efficient parameter tuning of neural foundation models for drug perspective prediction from unstructured socio-medical data. Engineering Applications of Artificial Intelligence 123:PA. DOI: 10.1016/j.engappai.2023.106214. Online publication date: 1-Aug-2023
    • (2022) Solving Hajj and Umrah Challenges Using Information and Communication Technology: A Survey. IEEE Access 10 (75404-75427). DOI: 10.1109/ACCESS.2022.3190853. Online publication date: 2022
    • (2021) Predicting Vaccine Hesitancy and Vaccine Sentiment Using Topic Modeling and Evolutionary Optimization. Natural Language Processing and Information Systems (255-263). DOI: 10.1007/978-3-030-80599-9_23. Online publication date: 20-Jun-2021
    • (2020) Kemampuan New Media Literacy Remaja dalam Mengenali Cyber Sexual Harassment di Surabaya. Palimpsest: Jurnal Ilmu Informasi dan Perpustakaan 11:2 (69). DOI: 10.20473/pjil.v11i2.24197. Online publication date: 30-Dec-2020
