DOI: 10.1145/2457276.2457284
research-article

Amharic-English bilingual web search engine

Published: 28 October 2012

ABSTRACT

As non-English content grows exponentially on the Web, the number of online non-English speakers who realize the importance of finding information in different languages is also growing enormously. However, the major general-purpose search engines, such as Google and Yahoo, have lagged behind in providing indexes and search features for non-English languages. Amharic, a member of the Semitic language family and the official working language of the federal government of Ethiopia, is one of the languages with rapidly growing content on the Web. As a result, the need for a bilingual search engine that handles the specific characteristics of queries in the user's native language (Amharic) and retrieves documents in both Amharic and English has become more apparent.

In this research work, we designed a model for an Amharic-English search engine and, based on that model, developed a bilingual Web search engine that enables Web users to find the information they need in both Amharic and English. In doing so, we identified the language-dependent query preprocessing components needed for query translation. We also developed a bidirectional dictionary-based translation system that incorporates a transliteration component to handle proper names, which are often missing from bilingual lexicons. We used an Amharic search engine and an open source English search engine (Nutch) as the underlying search engines for Web document crawling, indexing, searching, ranking and retrieval.
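The dictionary lookup with a transliteration fallback described above can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon entries, the three-character transliteration table, and the function names `transliterate` and `translate_query` are all hypothetical, chosen only to show how a term missing from the lexicon (such as a proper name) might fall through to transliteration.

```python
# Toy resources for illustration only (assumptions, not the paper's data).
LEXICON = {
    "ውሃ": ["water"],
    "ቤት": ["house", "home"],
}

# A tiny Ethiopic-to-Latin table covering just the example name below.
TRANSLIT = {"ኦ": "o", "ባ": "ba", "ማ": "ma"}


def transliterate(term):
    """Map each Ethiopic character to Latin script; keep unknown characters."""
    return "".join(TRANSLIT.get(ch, ch) for ch in term)


def translate_query(query):
    """Translate an Amharic query term by term.

    Terms found in the lexicon contribute all their listed translations;
    any other term is assumed to be a proper name and is transliterated.
    """
    translated = []
    for term in query.split():
        if term in LEXICON:
            translated.extend(LEXICON[term])
        else:
            translated.append(transliterate(term))
    return translated


print(translate_query("ኦባማ ቤት"))  # ['obama', 'house', 'home']
```

A real system would also need the language-dependent preprocessing mentioned above before lookup; this sketch skips it and matches surface forms directly.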

To evaluate the effectiveness of our Amharic-English bilingual search engine, we measured precision on the top 10 retrieved Web documents. The experimental results showed that the Amharic-English cross-lingual retrieval engine achieved 74.12% of the performance of its corresponding English monolingual retrieval engine, and the English-Amharic cross-lingual retrieval engine achieved 78.82% of the performance of its corresponding Amharic monolingual retrieval engine. We also evaluated the bilingual advantage of the system by comparing its results with those of general-purpose search engines. Overall, the evaluation results are promising.
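As a rough illustration of the kind of measure reported above, the sketch below computes precision over the top 10 results and expresses a cross-lingual score as a percentage of a monolingual baseline. The relevance judgments are invented for the example and are not the paper's data, and the function names are assumptions; the abstract does not spell out exactly how the per-query scores were aggregated.

```python
def precision_at_k(relevance, k=10):
    """Fraction of the top-k results judged relevant (1 = relevant, 0 = not)."""
    return sum(relevance[:k]) / k


def relative_effectiveness(cross_lingual, monolingual):
    """Cross-lingual score as a percentage of the monolingual score."""
    return 100.0 * cross_lingual / monolingual


# Hypothetical judgments for one query, for illustration only.
cross_run = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]  # 6 of 10 relevant
mono_run = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1]   # 8 of 10 relevant

p_cross = precision_at_k(cross_run)
p_mono = precision_at_k(mono_run)
print(round(relative_effectiveness(p_cross, p_mono), 2))  # 75.0
```

In practice the precision values would be averaged over a set of test queries before the ratio is taken.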


Published in

MEDES '12: Proceedings of the International Conference on Management of Emergent Digital EcoSystems
October 2012
199 pages
ISBN: 9781450317559
DOI: 10.1145/2457276
General Chair: Janusz Kacprzyk
Program Chair: Dominique Laurent
Publications Chair: Richard Chbeir

Copyright © 2012 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

MEDES '12 paper acceptance rate: 16 of 50 submissions, 32%
Overall acceptance rate: 267 of 682 submissions, 39%
