Construction and Exploitation of an Algerian Corpus for Opinion and Emotion Analysis

Advances in Knowledge Discovery and Management

Part of the book series: Studies in Computational Intelligence (SCI, volume 1004)

Abstract

The opinions we form are products of knowledge built up over the years from a mixture of our surroundings, culture and traditions. Several methods for analysing opinions have been proposed over the years. However, dialects have received little attention, as most studies focus on the English language. After reviewing the research carried out on North African dialects in general and the Algerian dialect in particular, and to compensate for the shortage of documented resources, we developed a platform for public data annotation, which we called “TWIFIL”. This resulted in an annotated corpus and a lexicon for the opinion and emotion analysis of Algerian dialects. The purpose of this work is twofold. Firstly, it addresses the shortage of relevant data for opinion and emotion analysis of the Algerian dialect. Secondly, it provides a more reliable annotated corpus, since each annotation reflects the assessment of more than one person. To validate our corpus, we built a neural sentiment classifier that gives competitive results compared to support vector machines, which shows that the proposed corpus can be used for opinion and emotion analysis tasks.

Notes

  1. https://wearesocial.com/blog/2018/01/global-digital-report-2018 (visited on 1 January 2019).

  2. https://www.slideshare.net/wearesocial/digital-in-2018-in-northern-africa-86865355.

  3. http://gs.statcounter.com/social-media-stats/all/algeria/2019.

  4. https://www.worldatlas.com/articles/what-languages-are-spoken-in-algeria.html.

  5. https://newrepublic.com/article/150506/universal-basic-income-future-of-pointless-work.

  6. All words quoted from the Algerian dialect were given by the authors, who are regular users of the dialect and social media.

  7. https://twifil.com.

  8. http://bit.do/twifil.

Corresponding author

Correspondence to Leila Moudjari.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this chapter

Moudjari, L., Akli-Astouati, K. (2022). Construction and Exploitation of an Algerian Corpus for Opinion and Emotion Analysis. In: Jaziri, R., Martin, A., Rousset, MC., Boudjeloud-Assala, L., Guillet, F. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 1004. Springer, Cham. https://doi.org/10.1007/978-3-030-90287-2_1
