Abstract
Stemming and lemmatization are two language modeling techniques used to improve precision in document retrieval. Stemming reduces all words that share a stem to a common form, whereas lemmatization removes inflectional endings and returns the base form of a word.
This paper explains how stemming or lemmatization in the Amazigh language can improve search outcomes by returning results that better match the user's query.
In document retrieval systems, lemmatization produced better precision than stemming. Overall, the findings suggest that language modeling techniques improve document retrieval, with lemmatization producing the best results.
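The distinction between the two techniques, and the precision measure used to compare them, can be illustrated with a minimal sketch. This is not the authors' method: the suffix list, the lemma table, and the function names `stem`, `lemmatize`, and `precision` are hypothetical, and the toy data is English rather than Amazigh.

```python
# Illustrative assumptions only: a toy English suffix list and lemma table,
# not the Amazigh rules or resources used in the paper.
SUFFIXES = ("ing", "ies", "es", "ed", "s")  # tried in this order

LEMMAS = {"studies": "study", "studying": "study", "went": "go"}


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word: str) -> str:
    """Look the word up in the lemma table; fall back to the word itself."""
    return LEMMAS.get(word, word)


def precision(retrieved: set, relevant: set) -> float:
    """Fraction of the retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


if __name__ == "__main__":
    for w in ("studies", "studying", "went"):
        print(w, stem(w), lemmatize(w))
    print(precision({1, 2, 3, 4}, {2, 4, 5}))
```

In this toy setup the stemmer maps "studies" and "studying" to different strings, while the lemma table maps both to "study". That is one way a lemmatizer might match a query to more of the relevant documents than a purely rule-based stemmer does.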
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Samir, A., Lahbib, Z. (2018). Stemming and Lemmatization for Information Retrieval Systems in Amazigh Language. In: Tabii, Y., Lazaar, M., Al Achhab, M., Enneya, N. (eds) Big Data, Cloud and Applications. BDCA 2018. Communications in Computer and Information Science, vol 872. Springer, Cham. https://doi.org/10.1007/978-3-319-96292-4_18
DOI: https://doi.org/10.1007/978-3-319-96292-4_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96291-7
Online ISBN: 978-3-319-96292-4
eBook Packages: Computer Science, Computer Science (R0)