
Sentiment Analysis of Iraqi Arabic Dialect on Facebook Based on Distributed Representations of Documents

Authors:
Anwar Alnawas

Department of Computer Engineering, Faculty of Technology, Gazi University, Turkey/Nasiriyah Technical Institute, Southern Technical University, Iraq

Nursal Arici

Department of Computer Engineering, Faculty of Technology, Gazi University, Ankara, Turkey


ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 18, Issue 3, Article No. 20, pp. 1–17. https://doi.org/10.1145/3278605

Published: 09 January 2019


Abstract

Social media is now widely used to express opinions on a variety of topics. Opinion Mining, or Sentiment Analysis, techniques extract opinions from user-generated content. Over the years, a multitude of Sentiment Analysis studies have addressed the English language, while research on other languages remains scarce. Arabic is one of the languages that lacks substantial research, despite the rapid growth of its use on social media. Furthermore, specific Arabic dialects should be studied, not just Modern Standard Arabic. In this paper, we experiment with sentiment analysis of the Iraqi Arabic dialect using word embeddings. First, we built a large corpus from previous works to learn word representations. Second, we generated a word embedding model by training on the corpus using Doc2Vec representations based on the Distributed Memory Model of Paragraph Vectors (DM-PV) architecture. Lastly, the resulting features were used to train four binary classifiers (Logistic Regression, Decision Tree, Support Vector Machine, and Naive Bayes) to detect sentiment. We also experimented with different parameter values (window size, dimension, and negative samples). The experiments show that our approach performs better with Logistic Regression and Support Vector Machine than with the other classifiers.
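
The abstract describes learning document vectors with the distributed memory architecture while varying window size, dimension, and negative samples. A minimal sketch of that training step in plain Python might look like the following. The function name `train_pv_dm`, its default values, and the toy tokens are hypothetical and are not taken from the paper; the authors' actual implementation, corpus, and parameter settings are not shown on this page.

```python
import math
import random


def train_pv_dm(docs, dim=20, window=2, negative=5, epochs=20, lr=0.05, seed=0):
    """Toy PV-DM trainer with negative sampling (illustrative only).

    docs is a list of token lists; returns one vector per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}

    def small_random_vector():
        return [(rng.random() - 0.5) / dim for _ in range(dim)]

    doc_vecs = [small_random_vector() for _ in docs]    # paragraph vectors
    word_vecs = [small_random_vector() for _ in vocab]  # input word vectors
    out_vecs = [[0.0] * dim for _ in vocab]             # output weights

    for _epoch in range(epochs):
        for d, doc in enumerate(docs):
            ids = [index[w] for w in doc]
            for pos, target in enumerate(ids):
                # Context = up to `window` words on each side of the target.
                ctx = ids[max(0, pos - window):pos] + ids[pos + 1:pos + 1 + window]
                inputs = [doc_vecs[d]] + [word_vecs[c] for c in ctx]
                # Distributed memory: average paragraph vector and context words.
                hidden = [sum(col) / len(inputs) for col in zip(*inputs)]
                # One positive sample plus up to `negative` random negatives.
                samples = [(target, 1.0)]
                for _draw in range(negative):
                    w = rng.randrange(len(vocab))
                    if w != target:
                        samples.append((w, 0.0))
                err = [0.0] * dim
                for w, label in samples:
                    score = sum(h * o for h, o in zip(hidden, out_vecs[w]))
                    score = max(-30.0, min(30.0, score))  # avoid overflow in exp
                    grad = lr * (label - 1.0 / (1.0 + math.exp(-score)))
                    for k in range(dim):
                        err[k] += grad * out_vecs[w][k]
                        out_vecs[w][k] += grad * hidden[k]
                # Propagate the accumulated error to every input vector.
                for vec in inputs:
                    for k in range(dim):
                        vec[k] += err[k]
    return doc_vecs


# Made-up toy comments standing in for preprocessed posts (not real data).
toy_docs = [
    ["service", "very", "good"],
    ["very", "bad", "service"],
    ["good", "price", "good", "quality"],
]
vectors = train_pv_dm(toy_docs, dim=8, window=2, negative=3, epochs=5)
print(len(vectors), len(vectors[0]))  # 3 8
```

The returned vectors would then serve as input features for the four classifiers named in the abstract; that stage is omitted here. Calling the function repeatedly with different `window`, `dim`, and `negative` values is one way to reproduce the kind of parameter comparison the abstract mentions.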


Index Terms

Sentiment Analysis of Iraqi Arabic Dialect on Facebook Based on Distributed Representations of Documents
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Sentiment analysis


Published in
ACM Transactions on Asian and Low-Resource Language Information Processing Volume 18, Issue 3
September 2019
386 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3305347
Editor:
Nianwen Xue
Brandeis University, Waltham, USA
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 9 January 2019
- Revised: 1 September 2018
- Accepted: 1 September 2018
- Received: 1 June 2018

Author Tags
Doc2Vec
Iraqi Arabic Dialect
facebook
sentiments analysis
word embedding
Qualifiers
- research-article
- Research
- Refereed

Article Metrics
- Total Citations: 25
- Total Downloads: 431
- Downloads (Last 12 months): 24
- Downloads (Last 6 weeks): 2
