An Adaptive Information Retrieval System for Efficient Web Searching

  • Conference paper
Advanced Machine Learning Technologies and Applications (AMLTA 2014)

Abstract

Stemming algorithms (stemmers) reduce words to their root form (stem) and are used in the pre-processing stage of information retrieval systems. Stemmers affect indexing time by reducing the size of the index file, and they improve the performance of the retrieval process. Of the several stemming algorithms available, the most widely used is the Porter stemming algorithm, because of its efficiency, simplicity, and speed, and because it handles exceptions easily. It nevertheless has drawbacks, and although many attempts have been made to improve its structure, those improvements remain incomplete. This paper presents an efficient information retrieval technique and proposes a new stemming algorithm called the Enhanced Porter’s Stemming Algorithm (EPSA). The objective of this technique is to overcome the drawbacks of the Porter algorithm and to improve web searching. EPSA was applied to two datasets to measure its performance. The results show an improvement in precision over the original Porter algorithm while achieving approximately the same recall.
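
As a rough illustration of the ideas in the abstract, the sketch below shows how stemming merges word variants into one index entry and how precision and recall are computed for a query. It is a toy example under stated assumptions: `toy_stem` is a crude suffix stripper with a hypothetical rule list, not the Porter algorithm and not EPSA, and the documents and relevance judgments are invented for the example.

```python
# Toy illustration only: a crude suffix stripper, NOT the Porter algorithm or EPSA.
SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical rule list, tried in this order


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stem to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(toy_stem(token), set()).add(doc_id)
    return index


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented mini-collection and relevance judgments, for illustration only.
docs = {
    1: "searching the web",
    2: "web search engines",
    3: "indexed documents",
}
index = build_index(docs)

# "searches", "searching" and "search" all share the index entry "search".
retrieved = index.get(toy_stem("searches"), set())
print(sorted(retrieved), precision_recall(retrieved, relevant={1, 2}))
```

Because variants share one stem, the index holds fewer entries than an unstemmed index would, which is the size reduction the abstract refers to. An over-aggressive stemmer would also merge unrelated words and lower precision, which is the kind of trade-off a comparison on precision and recall is meant to expose.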

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Hajeer, S.I., Ismail, R.M., Badr, N.L., Tolba, M.F. (2014). An Adaptive Information Retrieval System for Efficient Web Searching. In: Hassanien, A.E., Tolba, M.F., Taher Azar, A. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2014. Communications in Computer and Information Science, vol 488. Springer, Cham. https://doi.org/10.1007/978-3-319-13461-1_44

  • DOI: https://doi.org/10.1007/978-3-319-13461-1_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13460-4

  • Online ISBN: 978-3-319-13461-1

  • eBook Packages: Computer Science, Computer Science (R0)
