Abstract
Sentiment analysis is a fundamental problem in social media analysis that aims to determine the attitude of a writer. Recently, transformer-based models have shown great success in sentiment analysis and are considered the state of the art for many NLP tasks. However, the accuracy of sentiment analysis for low-resource languages still needs improvement. In this paper, we are concerned with sentiment analysis for Arabic documents. We first apply data augmentation techniques to publicly available datasets to improve the robustness of supervised sentiment analysis models. We then propose an ensemble architecture for Arabic sentiment analysis that combines different BERT models. We validate these methods on three available datasets. Our results show that the BERT-based ensemble method achieves an accuracy score of 96%.
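The abstract does not state how the BERT models are combined, so the following is only a minimal sketch of one common ensembling rule, soft voting, in which each model's class probabilities are averaged and the highest-scoring class is chosen. It is not necessarily the architecture used in the paper. The function name `soft_vote`, the two-class label order, and all probability values are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities across models; return the argmax per document.

    prob_sets[m][i][c] is model m's probability that document i has class c.
    """
    n_models = len(prob_sets)
    n_docs = len(prob_sets[0])
    predictions = []
    for i in range(n_docs):
        n_classes = len(prob_sets[0][i])
        # Mean probability of each class over all models for document i.
        avg = [
            sum(prob_sets[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        # Index of the class with the highest averaged probability.
        predictions.append(max(range(n_classes), key=lambda c: avg[c]))
    return predictions


# Hypothetical outputs of three fine-tuned BERT variants on two documents.
# Assumed class order: [negative, positive]. Values are made up.
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.6, 0.4], [0.7, 0.3]]
model_c = [[0.8, 0.2], [0.2, 0.8]]

labels = soft_vote([model_a, model_b, model_c])
print(labels)
```

In this toy input the second document is labeled positive by two of the three models, and the averaged probabilities agree with that majority. A stacking ensemble would instead train a separate classifier on the models' outputs; which variant the paper uses cannot be determined from the abstract alone.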
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chouikhi, H., Jarray, F. (2023). BERT-Based Ensemble Learning Approach for Sentiment Analysis. In: Fred, A., Aveiro, D., Dietz, J., Bernardino, J., Masciari, E., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2021. Communications in Computer and Information Science, vol 1718. Springer, Cham. https://doi.org/10.1007/978-3-031-35924-8_7
DOI: https://doi.org/10.1007/978-3-031-35924-8_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35923-1
Online ISBN: 978-3-031-35924-8
eBook Packages: Computer Science (R0)