DOI: 10.1145/2348283.2348491
poster

Looking inside the box: context-sensitive translation for cross-language information retrieval

Published: 12 August 2012

Abstract

Cross-language information retrieval (CLIR) today is dominated by techniques that use token-to-token mappings from bilingual dictionaries. Yet, state-of-the-art statistical translation models (e.g., using Synchronous Context-Free Grammars) are far richer, capturing multi-term phrases, term dependencies, and contextual constraints on translation choice. We present a novel CLIR framework that is able to reach inside the translation "black box" and exploit these sources of evidence. Experiments on the TREC-5/6 English-Chinese test collection show this approach to be promising.

    Published In

    SIGIR '12: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
    August 2012
    1236 pages
    ISBN: 9781450314725
    DOI: 10.1145/2348283

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. context
    2. machine translation

    Qualifiers

    • Poster

    Conference

    SIGIR '12
    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Cited By

    • (2020) Chinese Short Text Entity Linking Based On Semantic Similarity and Entity Correlation. 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), pages 426-431. DOI: 10.1109/ICTAI50040.2020.00073. Online publication date: Nov-2020
    • (2019) Chinese Social Media Entity Linking Based on Effective Context with Topic Semantics. 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), pages 386-395. DOI: 10.1109/COMPSAC.2019.00063. Online publication date: Jul-2019
    • (2018) Using Communities of Words Derived from Multilingual Word Vectors for Cross-Language Information Retrieval in Indian Languages. ACM Transactions on Asian and Low-Resource Language Information Processing 18:1, pages 1-27. DOI: 10.1145/3208358. Online publication date: 17-Dec-2018
    • (2016) Reranking Hypotheses of Machine-Translated Queries for Cross-Lingual Information Retrieval. Experimental IR Meets Multilinguality, Multimodality, and Interaction, pages 54-66. DOI: 10.1007/978-3-319-44564-9_5. Online publication date: 23-Aug-2016
    • (2015) Combining Orthogonal Information in Large-Scale Cross-Language Information Retrieval. Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 943-946. DOI: 10.1145/2766462.2767805. Online publication date: 9-Aug-2015
    • (2015) A Comparative Study of Online Translation Services for Cross Language Information Retrieval. Proceedings of the 24th International Conference on World Wide Web, pages 859-864. DOI: 10.1145/2740908.2743008. Online publication date: 18-May-2015
    • (2014) Exploiting Representations from Statistical Machine Translation for Cross-Language Information Retrieval. ACM Transactions on Information Systems 32:4, pages 1-32. DOI: 10.1145/2644807. Online publication date: 28-Oct-2014
    • (2014) Learning to translate queries for CLIR. Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, pages 1179-1182. DOI: 10.1145/2600428.2609539. Online publication date: 3-Jul-2014
    • (2013) Flat vs. hierarchical phrase-based translation models for cross-language information retrieval. Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval, pages 813-816. DOI: 10.1145/2484028.2484137. Online publication date: 28-Jul-2013
    • (2013) Studying machine translation technologies for large-data CLIR tasks: a patent prior-art search case study. Information Retrieval 17:5-6, pages 492-519. DOI: 10.1007/s10791-013-9231-6. Online publication date: 21-Nov-2013
