research-article

Sentiment Analysis of Iraqi Arabic Dialect on Facebook Based on Distributed Representations of Documents

Published: 09 January 2019

Abstract

Many people now use social media to express their opinions on a variety of topics. Opinion Mining, or Sentiment Analysis, techniques extract opinions from user-generated content. Over the years, many Sentiment Analysis studies have addressed the English language, while research on other languages remains scarce. Arabic is one of the languages that lacks substantial research, despite the rapid growth of its use on social media outlets. Furthermore, specific Arabic dialects should be studied, not just Modern Standard Arabic. In this paper, we experiment with sentiment analysis of the Iraqi Arabic dialect using word embeddings. First, we built a large corpus from previous works to learn word representations. Second, we generated a word embedding model by training on the corpus using Doc2Vec paragraph representations based on the Distributed Memory Model of Paragraph Vectors (DM-PV) architecture. Lastly, the resulting feature representations were used to train four binary classifiers (Logistic Regression, Decision Tree, Support Vector Machine, and Naive Bayes) to detect sentiment. We also experimented with different values of the parameters (window size, dimension, and number of negative samples). In light of the experiments, we conclude that our approach performs better with Logistic Regression and Support Vector Machine than with the other classifiers.

    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 18, Issue 3
      September 2019
      386 pages
      ISSN:2375-4699
      EISSN:2375-4702
      DOI:10.1145/3305347

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 January 2019
      • Revised: 1 September 2018
      • Accepted: 1 September 2018
      • Received: 1 June 2018

      Qualifiers

      • research-article
      • Research
      • Refereed
