Abstract
Sentiment analysis is the computational study of human emotions, attitudes and opinions through the extraction of meaningful information. Social media platforms, which allow consumers to share and publish content, are rich in opinionated information; however, many existing analyses are limited to a specific domain. This research presents an architecture for analyzing a low-resource language, Persian, and focuses on social media content consisting of informal comments across different domains. The proposed model applies a transformer-based model, ParsBERT, to classify the sentiment of social media comments. Since social media comments span different domains, the proposed model must be able to classify the sentiment of comments across those domains. ParsBERT has been fine-tuned on a Persian corpus generated for the purpose of this study. The corpus consists of 28,710 Instagram comments from different topic domains, each labeled as either negative or positive. The proposed model has been evaluated on test data belonging to different time periods and topic domains, and the results have been compared with recent sentiment analysis methods in three different scenarios. Results show that when the training and test data are from different domains, an accuracy of 68% is achieved, which is higher than that of the other shallow and deep learning methods compared for determining the sentiment of social media comments in different domains.
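The cross-domain scenario mentioned above, in which training and test comments come from different topic domains, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the domain names, the toy English comments, and the function names are hypothetical, and a majority-label baseline stands in for the fine-tuned ParsBERT classifier, which is not reproduced here.

```python
from collections import Counter
from typing import Callable, List, Tuple

# One labeled comment: (text, topic domain, label), with 1 = positive, 0 = negative.
Example = Tuple[str, str, int]


def split_by_domain(data: List[Example], held_out: str) -> Tuple[List[Example], List[Example]]:
    """Train on every domain except `held_out`; test only on `held_out`."""
    train = [ex for ex in data if ex[1] != held_out]
    test = [ex for ex in data if ex[1] == held_out]
    return train, test


def accuracy(predict: Callable[[str], int], test: List[Example]) -> float:
    """Fraction of test comments whose predicted label matches the gold label."""
    if not test:
        raise ValueError("test set is empty")
    correct = sum(1 for text, _, label in test if predict(text) == label)
    return correct / len(test)


def train_majority_baseline(train: List[Example]) -> Callable[[str], int]:
    """Hypothetical stand-in for fine-tuning: always predict the most common training label."""
    majority = Counter(label for _, _, label in train).most_common(1)[0][0]
    return lambda text: majority


def cross_domain_accuracy(
    data: List[Example],
    held_out: str,
    train_fn: Callable[[List[Example]], Callable[[str], int]] = train_majority_baseline,
) -> float:
    """Train on the remaining domains and report accuracy on the held-out domain."""
    train, test = split_by_domain(data, held_out)
    return accuracy(train_fn(train), test)


# Toy data, invented for this sketch (not taken from the study's corpus).
TOY_DATA: List[Example] = [
    ("great phone", "tech", 1),
    ("battery died fast", "tech", 0),
    ("loved this match", "sports", 1),
    ("what a win", "sports", 1),
    ("terrible referee", "sports", 0),
    ("delicious recipe", "food", 1),
    ("too salty", "food", 0),
]

if __name__ == "__main__":
    print(cross_domain_accuracy(TOY_DATA, held_out="food"))
```

Passing a different `train_fn` (for example, a wrapper around a fine-tuned transformer) would reuse the same split and scoring logic; the key design point is that no comment from the held-out domain is ever seen during training.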
Data availability
The dataset generated during the current study is available in the GitHub repository, https://github.com/mpanahi/BERT-crossDomain-sentiment-classification.
Funding
The authors did not receive support from any organization for the submitted work.
Author information
Contributions
MP coded the system, developed the dataset and performed experiments and wrote the methodology and results of the manuscript. SG wrote the introduction and literature review and guided the research. All authors reviewed the final manuscript.
Ethics declarations
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Panahandeh Nigjeh, M., Ghanbari, S. Leveraging ParsBERT for cross-domain polarity sentiment classification of Persian social media comments. Multimed Tools Appl 83, 10677–10694 (2024). https://doi.org/10.1007/s11042-023-16067-5
DOI: https://doi.org/10.1007/s11042-023-16067-5