ABSTRACT
Qualitative data is something most social scientists work with. In this study, qualitative Disaster Risk Reduction (DRR) suggestions were analyzed using topic modeling techniques. Latent Dirichlet allocation (LDA) was one of the topic models used. The ideal number of topics for LDA was 10, with a score of 530.1495. A Hierarchical Dirichlet Process (HDP) model was also used to extract topics from the corpus; it generated 11 topics with a log-likelihood score of -4.08997. The topics generated by the parametric LDA and the non-parametric HDP are nearly the same. To analyze the resulting topics, an open coding technique was used. The DRR responses focused on the following narratives: solid waste management and an improved drainage system; a relief and emergency plan; and an early warning system and disaster preparedness.
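As a rough illustration of what fitting an LDA topic model involves, the following is a minimal sketch of collapsed Gibbs sampling for LDA using only the Python standard library. It is not the implementation used in this study: the function name `fit_lda`, the hyperparameter values, and the toy documents are all assumptions made up for the example, and the sketch does not reproduce the scores reported above.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0, top_n=3):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents (illustrative only)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count of word v in topic k
    topic_total = [0] * num_topics                             # total tokens in topic k
    assignments = []

    # Assign every token to a random topic to start.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    # Most frequent words per topic.
    top_words = [
        [vocab[v] for v in sorted(range(vocab_size), key=lambda v: -topic_word[k][v])[:top_n]]
        for k in range(num_topics)
    ]
    return theta, top_words


# Hypothetical, already-tokenized suggestions (invented for this example).
docs = [
    ["clean", "drainage", "waste", "drainage"],
    ["waste", "segregation", "clean", "canal"],
    ["early", "warning", "siren", "evacuation"],
    ["evacuation", "drill", "warning", "early"],
]
theta, top_words = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

In practice the number of topics would be chosen by fitting several models and comparing an evaluation score, and a non-parametric model such as HDP would infer the number of topics from the data instead of taking it as an argument.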