SINAI at CLEF 2004: Using Machine Translation Resources with a Mixed 2-Step RSV Merging Algorithm

  • Conference paper
Multilingual Information Access for Text, Speech and Images (CLEF 2004)


In CLEF 2004, the SINAI group participated in the multilingual task. Our main interest was to test Machine Translation (MT) with a mixed 2-step RSV (retrieval status value) merging algorithm. 2-step RSV requires grouping the document frequency of each query term with the document frequencies of its translations, whereas MT works better on whole phrases than word by word, so a machine-translated query carries no explicit term-to-term correspondence and MT cannot be used directly with 2-step RSV merging. To solve this problem, we tested an algorithm that aligns the original query and its translation(s) at term level.
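The grouping step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `group_document_frequencies`, the data layout, and all figures are hypothetical, and it assumes a term-level alignment between the original query and its translations is already available.

```python
def group_document_frequencies(alignment, df_by_language):
    """Sum the document frequencies of each query term's aligned translations.

    alignment:      {source_term: {language: [translated terms]}}  (hypothetical layout)
    df_by_language: {language: {term: document frequency}}         (hypothetical layout)
    """
    grouped = {}
    for source_term, translations in alignment.items():
        total = 0
        for language, terms in translations.items():
            df_table = df_by_language.get(language, {})
            # Terms missing from a collection contribute a frequency of 0.
            total += sum(df_table.get(term, 0) for term in terms)
        grouped[source_term] = total
    return grouped


# Toy, made-up data: an English query aligned with Spanish and French translations.
alignment = {
    "oil": {"es": ["petróleo"], "fr": ["pétrole"]},
    "price": {"es": ["precio"], "fr": ["prix"]},
}
df_by_language = {
    "es": {"petróleo": 120, "precio": 300},
    "fr": {"pétrole": 95, "prix": 410},
}

grouped_df = group_document_frequencies(alignment, df_by_language)
print(grouped_df)
```

Without the `alignment` mapping there is no way to tell which translated words belong to which source term, which is the gap that term-level alignment is meant to close when the translation comes from an MT system.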

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Martínez-Santiago, F., García-Cumbreras, M.A., Díaz-Galiano, M.C., Ureña, L.A. (2005). SINAI at CLEF 2004: Using Machine Translation Resources with a Mixed 2-Step RSV Merging Algorithm. In: Peters, C., Clough, P., Gonzalo, J., Jones, G.J.F., Kluck, M., Magnini, B. (eds) Multilingual Information Access for Text, Speech and Images. CLEF 2004. Lecture Notes in Computer Science, vol 3491. Springer, Berlin, Heidelberg.

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27420-9

  • Online ISBN: 978-3-540-32051-7

  • eBook Packages: Computer Science, Computer Science (R0)
