DOI: 10.1145/2740908.2741729
research-article

Towards a Data-driven Approach to Identify Crisis-Related Topics in Social Media Streams

Published: 18 May 2015

Abstract

While categorizing any type of user-generated content online is a challenging problem, categorizing social media messages during a crisis adds a further layer of complexity, owing to the volume and variability of the information and to the fact that these messages must be classified as soon as they arrive. Current approaches use automatic classification, human classification, or a mixture of both. In all of these approaches, there are several reasons to keep the set of information categories small and up to date, which we examine in this article. This means that, at the onset of a crisis, an expert must select a handful of categories into which information will be classified. The next step, as the crisis unfolds, is to dynamically change the initial set as new information is posted online. In this paper, we propose an effective way to dynamically extract emerging, potentially interesting, new categories from social media data.

      Published In

      WWW '15 Companion: Proceedings of the 24th International Conference on World Wide Web
      May 2015
      1602 pages
      ISBN:9781450334730
      DOI:10.1145/2740908

      Sponsors

      • IW3C2: International World Wide Web Conference Committee

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. information types
      2. social media content analysis
      3. stream classification
      4. text classification

      Qualifiers

      • Research-article

      Conference

      WWW '15
      Sponsor:
      • IW3C2

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Cited By

      • (2025) Examining hurricane–related social media topics longitudinally and at scale: A transformer-based approach. PLOS ONE 20:1 (e0316852). DOI: 10.1371/journal.pone.0316852. Online publication date: 24-Jan-2025
      • (2025) Automatic Seed Word Selection for Topic Modeling. IEEE Access 13 (31269-31285). DOI: 10.1109/ACCESS.2025.3540410. Online publication date: 2025
      • (2024) OntoDSumm: Ontology-Based Tweet Summarization for Disaster Events. IEEE Transactions on Computational Social Systems 11:2 (2724-2739). DOI: 10.1109/TCSS.2023.3266025. Online publication date: Apr-2024
      • (2024) Topic Modeling Based Clustering of Disaster Tweets Using BERTopic. 2024 MIT Art, Design and Technology School of Computing International Conference (MITADTSoCiCon) (1-6). DOI: 10.1109/MITADTSoCiCon60330.2024.10575555. Online publication date: 25-Apr-2024
      • (2024) ADSumm: annotated ground-truth summary datasets for disaster tweet summarization. Social Network Analysis and Mining 14:1. DOI: 10.1007/s13278-024-01323-9. Online publication date: 5-Aug-2024
      • (2023) Review and Application of Knowledge Graph in Crisis Management. International Journal of Software Engineering and Knowledge Engineering 34:03 (393-425). DOI: 10.1142/S0218194023300038. Online publication date: 18-Nov-2023
      • (2023) Rise of social bots: The impact of social bots on public opinion dynamics in public health emergencies from an information ecology perspective. Telematics and Informatics 85 (102051). DOI: 10.1016/j.tele.2023.102051. Online publication date: Nov-2023
      • (2023) Exploring the impact of sentiment on multi-dimensional information dissemination using COVID-19 data in China. Computers in Human Behavior 144:C. DOI: 10.1016/j.chb.2023.107733. Online publication date: 1-Jul-2023
      • (2022) Influence of information attributes on information dissemination in public health emergencies. Humanities and Social Sciences Communications 9:1. DOI: 10.1057/s41599-022-01278-2. Online publication date: 5-Aug-2022
      • (2021) Social Cohesion: Mitigating Societal Risk in Case Studies of Digital Media in Hurricanes Harvey, Irma, and Maria. Risk Analysis 42:8 (1686-1703). DOI: 10.1111/risa.13820. Online publication date: 8-Sep-2021
