Abstract
In this paper, we build on the existing literature on topic modelling and sustainability by exploring Arabic text, mapping the Sustainable Development Goals (SDGs) set out by the United Nations to tweets published in Arabic. The work uses the popular Latent Dirichlet Allocation (LDA) technique to summarize and present subtopics that matter to various sustainability areas, with a focus on three of the 17 Sustainable Development Goals. A TF-IDF term-weighting scheme was applied and a document-term matrix extracted to highlight the most influential keywords that formed the topics. The work yields a unique set of topics and terms that correlate with certain areas of sustainability. Further exploration of Arabic sources will inform people concerned with sustainability about the various issues related to sustainable development in the Arab World. The work presented in this paper is a step towards formalizing a framework that will capture and analyze various aspects of unstructured data revolving around sustainability.
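The term-weighting step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy documents, the whitespace tokenizer, the unsmoothed idf formula log(N / df), and the helper names `document_term_matrix`, `tfidf` and `top_terms` are all assumptions made for illustration. In the setting described above, the documents would be preprocessed Arabic tweets, and the resulting matrix would then be passed to an LDA implementation.

```python
import math
from collections import Counter


def document_term_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term j in document i."""
    # Assumption: documents are already cleaned and can be split on whitespace.
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: j for j, term in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in tokenized]
    for i, toks in enumerate(tokenized):
        for term, count in Counter(toks).items():
            matrix[i][index[term]] = count
    return vocab, matrix


def tfidf(matrix):
    """Weight raw counts by relative term frequency times log(N / df)."""
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]
    weighted = []
    for row in matrix:
        total = sum(row) or 1  # guard against empty documents
        weighted.append([(c / total) * idf[j] for j, c in enumerate(row)])
    return weighted


def top_terms(vocab, weighted, k=3):
    """Rank terms by their summed TF-IDF weight over all documents."""
    scores = [sum(col) for col in zip(*weighted)]
    ranked = sorted(zip(vocab, scores), key=lambda p: (-p[1], p[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus standing in for preprocessed tweets.
docs = [
    "water energy water",
    "energy climate",
    "climate water health",
]
vocab, dtm = document_term_matrix(docs)
weights = tfidf(dtm)
print(vocab)
print(top_terms(vocab, weights, k=2))
```

Ranking by summed weight is only one simple way to surface influential keywords; other aggregation choices (for example, the maximum weight per term) are equally plausible under the description in the abstract.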
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Al Qudah, I., Hashem, I., Soufyane, A., Chen, W., Merabtene, T. (2022). Applying Latent Dirichlet Allocation Technique to Classify Topics on Sustainability Using Arabic Text. In: Arai, K. (eds) Intelligent Computing. SAI 2022. Lecture Notes in Networks and Systems, vol 506. Springer, Cham. https://doi.org/10.1007/978-3-031-10461-9_43
DOI: https://doi.org/10.1007/978-3-031-10461-9_43
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10460-2
Online ISBN: 978-3-031-10461-9
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)