Information Retrieval

Natural Language Processing of Semitic Languages

Abstract

In the past several years, some aspects of Information Retrieval (IR) for Semitic languages, primarily Arabic, have garnered a significant amount of attention. Research has focused mainly on retrieval of formal language, mostly in the news domain, covering ad hoc retrieval, OCR document retrieval, and cross-language retrieval. The literature on other aspects of retrieval remains sparse or non-existent, though industry has investigated some of them. The two main areas where literature is lacking are web search and social search. This survey covers two main areas: (1) a significant part of the literature on language-specific issues that affect retrieval; and (2) specialized retrieval problems, namely document image retrieval, cross-language search, web search, and social search.

Notes

  1. http://www.clef-initiative.eu

  2. http://research.nii.ac.jp/ntcir/index-en.html

  3. http://www.isical.ac.in/~fire/

  4. Buckwalter encoding is used to Romanize Arabic text in this chapter.

  5. http://en.wikipedia.org/wiki/Varieties_of_Arabic

  6. This is based on communication with people working on different web search engines.

Author information

Corresponding author

Correspondence to Kareem Darwish.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Darwish, K. (2014). Information Retrieval. In: Zitouni, I. (ed.) Natural Language Processing of Semitic Languages. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45358-8_10

  • DOI: https://doi.org/10.1007/978-3-642-45358-8_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45357-1

  • Online ISBN: 978-3-642-45358-8

  • eBook Packages: Computer Science, Computer Science (R0)
