Abstract
Information retrieval (IR) is a crucial area of natural language processing (NLP). A fundamental issue in bilingual information retrieval in search engines seems to be how, and how often, users query with phrases and chunks. The main problem arises when existing bilingual dictionaries cannot meet users' actual needs for translating such phrases and chunks into another language, so the results are often unreliable. This project introduces a heuristic method for extracting the correct equivalents of source-language chunks using monolingual and bilingual linguistic corpora together with text classification algorithms. In our experiments the method achieved an accuracy of 86.13%, which is encouraging.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Miangah, T.M., Nezarat, A. (2010). A Novel Method for Cross-Language Retrieval of Chunks Using Monolingual and Bilingual Corpora. In: Das, V.V., Vijaykumar, R. (eds) Information and Communication Technologies. ICT 2010. Communications in Computer and Information Science, vol 101. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15766-0_45
DOI: https://doi.org/10.1007/978-3-642-15766-0_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15765-3
Online ISBN: 978-3-642-15766-0
eBook Packages: Computer Science, Computer Science (R0)