DOI: 10.1145/3446999.3447012

Exploring Natural Language Processing Techniques in Social Media Analysis during a Pandemic: Understanding a corpus of Facebook posts using Word2vec and LDA

Published: 09 April 2021

Abstract

People around the world have used social media extensively to communicate and express opinions, especially during the rapid spread of COVID-19. The narratives of social media users are valuable because they can inform measures to curb the deadly disease. However, manually collecting and analyzing data from social media platforms such as Facebook is time-consuming. This study therefore used natural language processing (NLP) techniques, namely topic modeling and word embedding, to identify the concepts contained in the posts and comments of Facebook users in the Philippines regarding the pandemic. The study harvested posts and comments from Facebook groups whose members are primarily Filipino citizens expressing opinions and suggestions on COVID-19 responses. Using Latent Dirichlet Allocation (LDA), the study generated 10 topics related to the concepts of (1) self-discipline, (2) prayers for the frontliners, (3) total lockdown, (4) following government guidelines and protocols, and (5) flattening the curve of the disease. Meanwhile, the word groups generated by Word2vec yielded concepts such as (1) mass testing, (2) hope for faster recovery, and (3) expectations from the government. The average cosine similarity across word groups is 0.92, which implies strong relatedness among the words within each group. The study showed that NLP techniques helped in analyzing the themes of Facebook posts and comments related to the pandemic.
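The relatedness score mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the average was computed, so the code below assumes one plausible reading: the mean cosine similarity over all pairs of word vectors in a group. The three-dimensional vectors and the word "kits" are hypothetical toy values chosen for easy arithmetic; they are not the study's embeddings, and real Word2vec vectors typically have a hundred or more dimensions.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(vectors):
    """Average cosine similarity over every unordered pair of vectors.

    Assumption: this is one way to summarize how related the words in a
    group are; the paper's exact averaging procedure is not given here.
    """
    pairs = list(combinations(vectors, 2))
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy embeddings for one word group (not data from the study).
toy_group = {
    "mass": [1.0, 0.0, 0.0],
    "testing": [1.0, 1.0, 0.0],
    "kits": [1.0, 0.0, 1.0],
}

score = mean_pairwise_similarity(list(toy_group.values()))
print(round(score, 3))  # 0.638
```

A score near 1 means the vectors in the group point in nearly the same direction, which is the sense in which a value such as 0.92 is read as strong relatedness.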

Cited By

  • (2024) Extraction and Processing of Web Content for Corpus Creation: A Systematic Literature Review. New Perspectives in Software Engineering, 143-155. DOI: 10.1007/978-3-031-50590-4_9. Online publication date: 21-Feb-2024.
  • (2021) Stop words detection using a long short term memory recurrent neural network. Proceedings of the 2021 9th International Conference on Information Technology: IoT and Smart City, 199-202. DOI: 10.1145/3512576.3512612. Online publication date: 22-Dec-2021.
  • (2021) An analysis of Disaster Risk Suggestions using Latent Dirichlet Allocation and Hierarchical Dirichlet Process (Nonparametric LDA). Proceedings of the 2021 9th International Conference on Information Technology: IoT and Smart City, 181-184. DOI: 10.1145/3512576.3512608. Online publication date: 22-Dec-2021.


Published In

ICIT '20: Proceedings of the 2020 8th International Conference on Information Technology: IoT and Smart City
December 2020
266 pages
ISBN:9781450388559
DOI:10.1145/3446999
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. COVID-19
  2. Facebook
  3. Pandemic
  4. Social media

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICIT 2020
ICIT 2020: IoT and Smart City
December 25 - 27, 2020
Xi'an, China
