DOI: 10.1145/2701336.2701639

PAN@FIRE: Overview of the Cross-Language !ndian News Story Search (CL!NSS) Track

Published: 04 December 2013

Abstract

The automatic alignment of documents in a quasi-comparable corpus is an important research problem for resource-poor cross-language technologies. News stories are one of the most prolific and abundant language resources. The PAN@FIRE task, Cross-Language !ndian News Story Search (CL!NSS), aimed to address news story linking across English and Hindi. We present an overview of the track, with results and analysis.




    Published In

    FIRE '12 & '13: Proceedings of the 4th and 5th Annual Meetings of the Forum for Information Retrieval Evaluation
    December 2013
    105 pages
    ISBN: 9781450328302
    DOI: 10.1145/2701336
    Editors: Prasenjit Majumder, Mandar Mitra, Madhulika Agrawal, Parth Mehta

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Text reuse
    2. cross-language information retrieval
    3. news search

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    FIRE '13: Forum for Information Retrieval Evaluation
    December 4 - 6, 2013
    New Delhi, India

    Acceptance Rates

    Overall Acceptance Rate 19 of 64 submissions, 30%
