
TALP at GeoCLEF 2006: Experiments Using JIRS and Lucene with the ADL Feature Type Thesaurus

  • Conference paper
Evaluation of Multilingual and Multi-modal Information Retrieval (CLEF 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4730)

Abstract

This paper describes our experiments in Geographical Information Retrieval (GIR) in the context of our participation in the CLEF 2006 GeoCLEF Monolingual English task. Our system, named TALP-GeoIR, follows an architecture similar to that of the GeoTALP-IR system presented at GeoCLEF 2005, with some changes in the retrieval modes and the Geographical Knowledge Base (KB).

The system has four phases performed sequentially: i) a Keyword Selection algorithm based on a linguistic and geographical analysis of the topics, ii) a geographical retrieval with Lucene, iii) a document retrieval task with the JIRS Passage Retrieval (PR) software, and iv) a Document Ranking phase. A Geographical KB has been built using a set of publicly available geographical gazetteers and the Alexandria Digital Library (ADL) Feature Type Thesaurus.

In our experiments we used JIRS, a state-of-the-art PR system for Question Answering, for the GIR task. We also experimented with an approach that combines JIRS and Lucene: JIRS was used only for textual document retrieval, and Lucene was used to detect the geographically relevant documents. These experiments show that using JIRS alone yields better results than combining JIRS and Lucene.
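As a rough illustration of the two configurations compared above, the following Python sketch contrasts ranking by textual scores alone with ranking restricted to a set of geographically relevant documents. It is not the TALP-GeoIR implementation: the function name `rank_documents`, the score dictionary, the document identifiers, and the use of a hard filter to combine the two sources are all assumptions made for illustration, and neither JIRS nor Lucene is actually invoked.

```python
from typing import Dict, List, Optional, Set


def rank_documents(
    textual_scores: Dict[str, float],
    geo_relevant: Optional[Set[str]] = None,
) -> List[str]:
    """Return document ids ordered by descending textual score.

    textual_scores stands in for the output of a textual retrieval
    engine (hypothetical toy values, not real JIRS output).
    geo_relevant stands in for the documents a geographical index
    flagged as relevant (hypothetical, not real Lucene output).
    When geo_relevant is given, documents outside it are dropped;
    this hard filter is an assumed combination strategy.
    """
    candidates = list(textual_scores.items())
    if geo_relevant is not None:
        candidates = [(doc, s) for doc, s in candidates if doc in geo_relevant]
    # Sort by score (highest first), breaking ties by document id.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [doc for doc, _ in candidates]


# Toy data, invented for this example only.
scores = {"doc1": 0.9, "doc2": 0.7, "doc3": 0.4}
geo_docs = {"doc2", "doc3"}

jirs_only = rank_documents(scores)            # textual ranking alone
combined = rank_documents(scores, geo_docs)   # textual ranking + geo filter

print(jirs_only)
print(combined)
```

In this toy setup the combined mode can only remove documents from the textual ranking, which makes it easy to see how a geographical filter could discard relevant documents that the textual engine had ranked highly. Whether that effect explains the reported results is not stated in the abstract.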



Editor information

Carol Peters, Paul Clough, Fredric C. Gey, Jussi Karlgren, Bernardo Magnini, Douglas W. Oard, Maarten de Rijke, Maximilian Stempfhuber

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ferrés, D., Rodríguez, H. (2007). TALP at GeoCLEF 2006: Experiments Using JIRS and Lucene with the ADL Feature Type Thesaurus. In: Peters, C., et al. Evaluation of Multilingual and Multi-modal Information Retrieval. CLEF 2006. Lecture Notes in Computer Science, vol 4730. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74999-8_124

  • DOI: https://doi.org/10.1007/978-3-540-74999-8_124

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74998-1

  • Online ISBN: 978-3-540-74999-8

  • eBook Packages: Computer Science, Computer Science (R0)
