
Evaluation of Perstem: A Simple and Efficient Stemming Algorithm for Persian

  • Conference paper
Multilingual Information Access Evaluation I. Text Retrieval Experiments (CLEF 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6241)


Abstract

Persian is a challenging language for natural language processing (NLP). Its right-to-left orthography, complex morphology, complicated grammatical rules, and multiple letter forms make it an interesting subject for NLP research. In this paper we measure the effectiveness of Perstem, a simple and efficient stemming algorithm, for Persian information retrieval. Our experiments on the Hamshahri corpus at CLEF 2009 show that Perstem greatly improved both precision (+91%) and recall (+43%).
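The percentages in the abstract read most naturally as relative changes against an unstemmed baseline, although the preview does not say exactly how they were computed. A minimal sketch of that arithmetic follows, assuming set-based precision and recall for a single query. The function names and the toy document IDs are hypothetical and are not taken from the paper or from the Hamshahri corpus.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, using set overlap."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def relative_change(baseline, improved):
    """Relative change in percent, e.g. 0.25 -> 0.5 gives 100.0."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline * 100.0


# Hypothetical toy data for illustration only (not results from the paper).
relevant = {"d1", "d2", "d3", "d4"}
unstemmed_run = ["d1", "d7", "d8", "d9"]  # run without stemming
stemmed_run = ["d1", "d2", "d3", "d9"]    # run with stemming

p0, r0 = precision_recall(unstemmed_run, relevant)
p1, r1 = precision_recall(stemmed_run, relevant)

print(f"precision change: {relative_change(p0, p1):+.0f}%")
print(f"recall change:    {relative_change(r0, r1):+.0f}%")
```

Under this reading, a reported "+91%" would mean the stemmed run's precision is 1.91 times the baseline's, not 91 percentage points higher.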




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jadidinejad, A.H., Mahmoudi, F., Dehdari, J. (2010). Evaluation of Perstem: A Simple and Efficient Stemming Algorithm for Persian. In: Peters, C., et al. Multilingual Information Access Evaluation I. Text Retrieval Experiments. CLEF 2009. Lecture Notes in Computer Science, vol 6241. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15754-7_11


  • DOI: https://doi.org/10.1007/978-3-642-15754-7_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15753-0

  • Online ISBN: 978-3-642-15754-7

  • eBook Packages: Computer Science, Computer Science (R0)
