ABSTRACT
In this paper, we present a framework that combines automatic and manual approaches to discover themes in disaster-related tweets. As a case study, we focus on tweets related to Typhoon Haiyan, which caused billions of dollars in damage. We collected tweets from November 2013 to March 2014, using the local typhoon name "Yolanda" as the filter. Data association was used to expand the tweet set, and k-means clustering was then applied. Clusters with a high number of instances were subjected to open coding for labeling. The Silhouette indices ranged from 0.27 to 0.50. Our analyses reveal that an automated Natural Language Processing (NLP) approach has the potential to deal with huge volumes of tweets by clustering frequently occurring words and phrases. This complements the manual approach to surfacing themes from a more manageable pool of tweets, allowing for a more nuanced analysis by a human expert. As an application, the themes identified during open coding were used as labels to train a classifier system. Future work could explore topic models and focus on specific content or issues, such as natural calamities and citizens' participation in addressing them.
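The clustering and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the example tweets, the whitespace tokenizer, the term-frequency features, the function names, and the choice of k = 2 are all assumptions made for illustration, and the loop below is plain Lloyd-style k-means rather than any specific variant the paper may have used. The silhouette function follows the standard definition, averaging (b - a) / max(a, b) over all points.

```python
import math
import random
from collections import Counter


def vectorize(texts):
    """Turn each text into a term-frequency vector over a shared vocabulary."""
    tokenized = [t.lower().split() for t in texts]
    vocab = sorted({w for toks in tokenized for w in toks})
    return [[float(Counter(toks)[w]) for w in vocab] for toks in tokenized]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=50, seed=0):
    """Lloyd-style k-means; returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(vectors, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    scores = []
    for i, v in enumerate(vectors):
        same = [
            distance(v, w)
            for j, w in enumerate(vectors)
            if labels[j] == labels[i] and j != i
        ]
        other_means = []
        for c in set(labels) - {labels[i]}:
            d = [distance(v, w) for j, w in enumerate(vectors) if labels[j] == c]
            other_means.append(sum(d) / len(d))
        if not same or not other_means:
            scores.append(0.0)
            continue
        a = sum(same) / len(same)   # mean distance within own cluster
        b = min(other_means)        # mean distance to nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical tweets, invented for this sketch (not from the paper's data).
tweets = [
    "yolanda relief goods donation drive",
    "donation drive for yolanda victims",
    "pray for the philippines yolanda",
    "pray for tacloban yolanda survivors",
]
vectors = vectorize(tweets)
cluster_labels = kmeans(vectors, k=2)
score = silhouette(vectors, cluster_labels)
print(cluster_labels, round(score, 2))
```

In a workflow like the one described, the largest clusters would then be handed to human coders for open coding, with the silhouette value serving only as a rough indicator of how well separated the clusters are.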
Index Terms
- Combining Automatic and Manual Approaches: Towards a Framework for Discovering Themes in Disaster-related Tweets