
Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6241)


Abstract

This paper reports experiments performed in the course of the CLEF’09 Intellectual Property track, where our main goal was to study automatic query generation from patent documents. Two simple word weighting algorithms (a modified RATF formula and tf·idf) were tested for selecting query keys from the patent documents. We also investigated using different parts of the patent documents as sources of query keys. Our best runs placed relatively well compared to the other CLEF-IP’09 participants’ runs. This suggests that the tested approaches to automatic query generation could be useful and should be developed further. For three topics, the performance of the automatically extracted queries was compared to that of queries produced by three patent experts, to see whether the automatic keyword extraction algorithms are able to extract relevant words from the topics.
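As an illustration of the tf·idf variant mentioned in the abstract, the following is a minimal sketch of selecting query keys from a topic document by tf·idf weight. It is not the authors' implementation: the tokenizer, the smoothed idf, the function name `tfidf_query_keys`, and the parameter `k` are all assumptions made for this example, and the modified RATF formula is not shown because the abstract does not specify it.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_query_keys(topic_text, collection, k=5):
    """Return the k highest-weighted terms of topic_text as query keys.

    tf is the raw term frequency in the topic document; idf is computed
    over `collection` with add-one smoothing (an assumption), so terms
    that never occur in the collection still get a finite weight.
    """
    doc_term_sets = [set(tokenize(doc)) for doc in collection]
    n_docs = len(doc_term_sets)
    tf = Counter(tokenize(topic_text))

    scores = {}
    for term, freq in tf.items():
        df = sum(1 for terms in doc_term_sets if term in terms)
        idf = math.log((n_docs + 1) / (df + 1)) + 1.0
        scores[term] = freq * idf

    # Sort by descending weight; break ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    # Hypothetical toy collection and topic, for illustration only.
    collection = [
        "a valve for a pump",
        "a pump housing",
        "a method for coating",
    ]
    topic = "valve valve pump seal"
    print(tfidf_query_keys(topic, collection, k=2))
```

In this sketch a term scores highly when it is frequent in the topic document and rare in the collection, which is the general intuition behind using tf·idf to pick query keys. Restricting `topic_text` to a single field, such as the claims or the abstract, would correspond to using different parts of the patent as key sources.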




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Järvelin, A., Järvelin, A., Hansen, P. (2010). UTA and SICS at CLEF-IP’09. In: Peters, C., et al. Multilingual Information Access Evaluation I. Text Retrieval Experiments. CLEF 2009. Lecture Notes in Computer Science, vol 6241. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15754-7_55


  • DOI: https://doi.org/10.1007/978-3-642-15754-7_55

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15753-0

  • Online ISBN: 978-3-642-15754-7

  • eBook Packages: Computer Science, Computer Science (R0)
