
Lexicon Based Sentiment Analysis of Urdu Text Using SentiUnits

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6437)

Abstract

As with other languages, Urdu websites are growing in popularity because people prefer to share opinions and express sentiments in their own language. Sentiment analyzers developed for well-studied languages such as English do not work for Urdu because of differences in script, morphology, and grammar. Urdu should therefore be studied as an independent problem domain. Our approach to sentiment analysis is based on identifying and extracting SentiUnits from the given text using shallow parsing. SentiUnits are the expressions that carry the sentiment information in a sentence. We use a sentiment-annotated, lexicon-based approach. Unfortunately, no such lexicon exists for Urdu, so a major part of this research consists of developing one. This paper is therefore presented as a baseline for this large and complex task. Our goal is to highlight the linguistic (grammar and morphology) as well as the technical aspects of this multidimensional research problem. The performance of the system is evaluated on multiple texts, and the results achieved are quite satisfactory.
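To make the general idea of a lexicon-based approach concrete, the following is a minimal sketch and not the system described in the paper. It assumes a tiny hand-made lexicon of romanized Urdu words with illustrative polarity scores, treats an optional preceding intensifier and an optional following negation as part of a sentiment-bearing unit, and sums the unit scores per sentence. The names `LEXICON`, `INTENSIFIERS`, `NEGATIONS`, `extract_senti_units`, and `classify` are hypothetical, and whitespace tokenization stands in for real shallow parsing.

```python
# Toy lexicon in romanized Urdu. Entries and scores are illustrative
# assumptions, not taken from the paper's lexicon.
LEXICON = {"acha": 1.0, "bura": -1.0, "kharab": -1.0}
INTENSIFIERS = {"bohat": 1.5}  # "very": scales the score of the next word
NEGATIONS = {"nahin"}          # "not": flips the score of the previous word


def extract_senti_units(tokens):
    """Return (phrase, score) pairs for each sentiment-bearing unit.

    A unit here is a lexicon word plus an optional intensifier directly
    before it and an optional negation directly after it. This is a
    simplification standing in for chunk-level shallow parsing.
    """
    units = []
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        score = LEXICON[tok]
        words = [tok]
        if i > 0 and tokens[i - 1] in INTENSIFIERS:
            score *= INTENSIFIERS[tokens[i - 1]]
            words.insert(0, tokens[i - 1])
        if i + 1 < len(tokens) and tokens[i + 1] in NEGATIONS:
            score = -score
            words.append(tokens[i + 1])
        units.append((" ".join(words), score))
    return units


def classify(sentence):
    """Label a sentence by the sign of its summed unit scores."""
    total = sum(score for _, score in extract_senti_units(sentence.split()))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ("yeh khana bohat acha hai", "yeh khana acha nahin hai"):
        print(text, "->", classify(text))
```

Even this toy version hints at why the abstract treats Urdu as its own problem domain: an inflected form such as "achi" would miss the entry "acha" without morphological handling, and negation follows the adjective rather than preceding it as in English.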


Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Syed, A.Z., Aslam, M., Martinez-Enriquez, A.M. (2010). Lexicon Based Sentiment Analysis of Urdu Text Using SentiUnits. In: Sidorov, G., Hernández Aguirre, A., Reyes García, C.A. (eds) Advances in Artificial Intelligence. MICAI 2010. Lecture Notes in Computer Science, vol 6437. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16761-4_4

  • DOI: https://doi.org/10.1007/978-3-642-16761-4_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16760-7

  • Online ISBN: 978-3-642-16761-4

  • eBook Packages: Computer Science, Computer Science (R0)
