Leveraging ParsBERT for cross-domain polarity sentiment classification of Persian social media comments

Multimedia Tools and Applications

Abstract

Sentiment analysis is the computational study of human emotions, attitudes, and opinions through the extraction of meaningful information. Social media platforms, which allow users to share and publish content, are rich in opinionated information; however, most current analytical research is limited to a specific domain. This research presents an architecture for analyzing a low-resource language, Persian, and focuses on social media content consisting of informal comments across different domains. The proposed model applies a transformer-based model, ParsBERT, to classify the sentiment of social media comments. Because social media comments span different domains, the model must be able to classify the sentiment of comments across domains. ParsBERT has been fine-tuned on a Persian corpus generated for the purpose of this study. The corpus was gathered from 28,710 Instagram comments in different topic domains, each labeled as either negative or positive. The proposed model has been evaluated on test data from different time periods and topic domains, and the results have been compared with recent sentiment analysis methods in three different scenarios. The results show that when the training and test data come from different domains, the model achieves an accuracy of 68%, which is higher than that of the shallow and deep learning methods it was compared with for determining the sentiment of social media comments across domains.
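The cross-domain scenario described in the abstract (training on some topic domains and testing on another) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Comment` record, the domain names, the toy English comments, and the helper functions are hypothetical and are not taken from the authors' code or corpus. A trivial majority-class baseline stands in for the classifier; in the setting of the paper, a fine-tuned ParsBERT model would take its place.

```python
from collections import Counter
from typing import Callable, List, NamedTuple, Tuple


class Comment(NamedTuple):
    text: str
    domain: str  # hypothetical topic label, e.g. "sports"
    label: int   # 1 = positive, 0 = negative


def cross_domain_split(
    comments: List[Comment], held_out_domain: str
) -> Tuple[List[Comment], List[Comment]]:
    """Train on every domain except `held_out_domain`; test on that domain only."""
    train = [c for c in comments if c.domain != held_out_domain]
    test = [c for c in comments if c.domain == held_out_domain]
    return train, test


def accuracy(predict: Callable[[str], int], test: List[Comment]) -> float:
    """Fraction of test comments whose predicted label matches the gold label."""
    if not test:
        raise ValueError("empty test set")
    correct = sum(predict(c.text) == c.label for c in test)
    return correct / len(test)


def fit_majority_baseline(train: List[Comment]) -> Callable[[str], int]:
    """Placeholder classifier: always predicts the most frequent training label.

    This is NOT the paper's model; a fine-tuned ParsBERT classifier
    would be substituted here.
    """
    majority = Counter(c.label for c in train).most_common(1)[0][0]
    return lambda text: majority


# Toy data, invented for illustration only.
comments = [
    Comment("great match", "sports", 1),
    Comment("terrible referee", "sports", 0),
    Comment("loved the goal", "sports", 1),
    Comment("boring film", "cinema", 0),
    Comment("wonderful acting", "cinema", 1),
]

train, test = cross_domain_split(comments, held_out_domain="cinema")
predict = fit_majority_baseline(train)
score = accuracy(predict, test)
print(f"cross-domain accuracy: {score:.2f}")
```

Keeping the split and the metric independent of the classifier means the same evaluation could, in principle, be reused for any of the compared methods by swapping the `predict` function.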

Data availability

The dataset generated during the current study is available in the GitHub repository, https://github.com/mpanahi/BERT-crossDomain-sentiment-classification.

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Contributions

MP coded the system, developed the dataset, performed the experiments, and wrote the methodology and results sections of the manuscript. SG wrote the introduction and literature review and guided the research. All authors reviewed the final manuscript.

Corresponding author

Correspondence to Mahnaz Panahandeh Nigjeh.

Ethics declarations

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Panahandeh Nigjeh, M., Ghanbari, S. Leveraging ParsBERT for cross-domain polarity sentiment classification of Persian social media comments. Multimed Tools Appl 83, 10677–10694 (2024). https://doi.org/10.1007/s11042-023-16067-5

  • DOI: https://doi.org/10.1007/s11042-023-16067-5
