DOI: 10.1145/2740908.2742125
research-article

Combining Automatic and Manual Approaches: Towards a Framework for Discovering Themes in Disaster-related Tweets

Published: 18 May 2015

ABSTRACT

In this paper, we present a framework that combines automatic and manual approaches to discover themes in disaster-related tweets. As a case study, we focus on tweets related to Typhoon Haiyan, which caused billions of dollars in damage. We collected tweets from November 2013 to March 2014, using the local typhoon name "Yolanda" as the filter. Data association was used to expand the tweet set, and k-means clustering was then applied. Clusters with a high number of instances were subjected to open coding for labeling. The Silhouette indices ranged from 0.27 to 0.50. Our analyses reveal that an automated Natural Language Processing (NLP) approach has the potential to deal with huge volumes of tweets by clustering frequently occurring words and phrases. This complements the manual approach, which surfaces themes from a more manageable pool of tweets and allows for a more nuanced analysis by a human expert. As an application, the themes identified during open coding were used as labels to train a classifier. Future work could explore topic models and focus on specific content or issues, such as natural calamities and citizens' participation in addressing them.
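
The clustering and evaluation steps mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes bag-of-words count vectors, Euclidean distance, and a plain Lloyd-style k-means loop, any of which may differ from what the paper actually used. The function names (`vectorize`, `kmeans`, `silhouette`) and the six toy tweets are invented for the example.

```python
import math
import random
from collections import Counter


def vectorize(texts):
    """Turn each text into a bag-of-words count vector over a shared vocabulary."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vectors = []
    for t in texts:
        counts = Counter(t.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=50, seed=0):
    """Lloyd-style k-means (an assumption); returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient, s = (b - a) / max(a, b), over all points."""
    clusters = {}
    for i, l in enumerate(labels):
        clusters.setdefault(l, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, l in enumerate(labels):
        own = [j for j in clusters[l] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in idx) / len(idx)
            for other, idx in clusters.items()
            if other != l
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Invented toy tweets, for illustration only (not from the paper's data set).
tweets = [
    "relief goods needed in tacloban",
    "send relief goods to tacloban",
    "relief goods for tacloban",
    "pray for the philippines",
    "pray for philippines today",
    "we pray for the philippines",
]
points = vectorize(tweets)
labels = kmeans(points, k=2)
score = silhouette(points, labels)
```

The silhouette value lies between -1 and 1, with higher values indicating tighter, better-separated clusters; the range of 0.27 to 0.50 reported in the abstract is on this scale. In the framework described, the larger clusters would then be handed to human coders for open coding rather than labeled automatically.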


Published in

WWW '15 Companion: Proceedings of the 24th International Conference on World Wide Web
May 2015
1602 pages
ISBN: 9781450334730
DOI: 10.1145/2740908

Copyright © 2015. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
