Limitations of Automatic Patent IR

A Plea for More Interactivity

  • Schwerpunktbeitrag (focus article)
  • Published in: Datenbank-Spektrum

Abstract

In recent years, the interest of the IR community in patent retrieval has risen. Various experiments have been undertaken, and the problems of the field have been explored in depth. Results show that the most important steps in the patent retrieval process are difficult to automate. This paper gives an overview of the methods that have been tested in the domain, including their results, and promotes a complementary, more interactive approach to patent retrieval. Finally, promising areas for future research are described.

Notes

  1. http://www.clef-initiative.eu/.

  2. http://research.nii.ac.jp/ntcir/index-en.html.

  3. Prior art is, in general, all the knowledge that existed prior to the relevant filing or priority date of a patent application, whether it existed by way of written or oral disclosure [64].

  4. There are no exact numbers for the quantity of patents available. Estimates range between 50 and 90 million documents, which still gives an impression of the magnitude that has to be dealt with.

  5. Lopez & Romary achieved a higher MAP score in CLEF 2009 and CLEF 2010 by using the citations that were the basis for the relevance assessment. Because exploiting these citations necessarily leads to better results, the approach cannot be included in a fair comparison.

References

  1. Adams S (2009) New methodologies for patent searching; what do we need? In: Proceedings of the global symposium of intellectual property authorities. Available at http://www.wipo.int/export/sites/www/meetings/en/2009/sym_ip_auth/pdf/stephen_adams.pdf

  2. Agatonovic M, Aswani N, Bontcheva K, Cunningham H, Heitz T, Li Y, Roberts I, Tablan V (2008) Large-scale, parallel automatic patent annotation. In: Proceedings of workshop on patent information retrieval, pp 1–8

  3. Azzopardi L, Vanderbauwhede W, Joho H (2010) Search system requirements of patent analysts. In: SIGIR ’10, pp 775–776

  4. Becks D (2013) Entwicklung eines Framework für die begriffliche Optimierung von Patentanfragen. Doctoral Thesis, Faculty of Language and Information Sciences, University of Hildesheim

  5. Becks D, Mandl T, Womser-Hacker C (2010) Phrases or terms? The impact of different query types. In: CLEF notebook papers/labs/workshop

  6. Bouayad-Agha N, Casamayor G, Ferraro G, Mille S, Vidal V, Wanner L (2009) Improving the comprehension of legal documentation: the case of patent claims. In: Proceedings of international conference on artificial intelligence and law, pp 78–87

  7. Ceausu A, Tinsley J, Way A, Zhang J, Sheridan P (2011) Experiments on domain adaptation for patent machine translation in the PLuTO project. In: Proceedings of conference of the European association for machine translation, pp 21–28

  8. D’hondt E, Verberne S, Alink W, Cornacchia R (2011) Combining document representations for prior-art retrieval. In: CLEF notebook papers/labs/workshop

  9. Eisinger D (2013) Automated patent categorization and guided patent search using IPC as inspired by MeSH and PubMed. Doctoral Thesis, Faculty of Informatics, TU Dresden

  10. EPO (2012) Patent searching for beginners—EPO patent information beginners seminar (EPIBS). http://www.epo.org/learning-events/events/training/patent-information-training/PI01-2013.html. Accessed 24 September 2013

  11. Fujii A (2007) Enhancing patent retrieval by citation analysis. In: Proceedings of international ACM SIGIR conference on research and development in information retrieval, pp 793–794

  12. Fujita S (2005) Revisiting document length hypotheses: a comparative study of Japanese newspaper and patent retrieval. ACM Trans Asian Lang Inf Process 4:207–235

  13. Ganguly D, Leveling J, Jones G (2011) United we fall, divided we stand: a study of query segmentation and PRF for patent prior art search. In: Proceedings of the 4th workshop on patent information retrieval. ACM, New York, pp 13–18

  14. Giachanou A, Salampasis M, Satratzemi M, Samaras N (2013) Report on the CLEF-IP 2013 experiments: multilayer collection selection on topically organized patents. In: CLEF online working notes

  15. Gobeill J, Ruch P (2012) BiTeM site report for the claims to passage task in CLEF-IP 2012. In: CLEF online working notes/labs/workshop

  16. Goto I, Lu B, Chow KP, Sumita E, Tsou BK (2011) Overview of the patent machine translation task at the NTCIR-9 workshop. In: Proceedings of NTCIR-9 workshop meeting, December 6–9, Tokyo, Japan, pp 559–578

  17. Hackl-Sommer RA (2010) Transparentes Ranking und Relevanz-Feedback im Patentretrieval. Fachinformationszentrum Karlsruhe

  18. Harris CG, Arens R, Srinivasan P (2010) Comparison of IPC and USPC classification systems in patent prior art searches. In: Proceedings of workshop on patent information retrieval. ACM, New York, pp 27–32

  19. Huang Z, Devlin J, Matsoukas S (2013) BBN’s systems for the Chinese-English sub-task of the NTCIR-10 PatentMT evaluation. In: Proceedings of NTCIR-10, pp 287–293

  20. Indukuri KV, Ambekar AA, Sureka A (2007) Similarity analysis of patent claims using natural language processing techniques. In: International conference on computational intelligence and multimedia applications, vol 4, pp 169–175

  21. Iwayama M, Fujii A, Kando N, Marukawa Y (2003) An empirical study on retrieval models for different document genres: patents and newspaper articles. In: Proceedings of international ACM SIGIR conference on research and development in information retrieval, pp 251–258

  22. Jochim C, Lioma C, Schütze H (2011) Expanding queries with term and phrase translations in patent retrieval. In: Multidisciplinary information retrieval, pp 16–29

  23. Kim Y, Seo J, Croft WB (2011) Automatic Boolean query suggestion for professional search. In: Proceedings of international ACM SIGIR conference on research and development in information retrieval, pp 825–834

  24. Kishida K (2003) Experiments on pseudo relevance feedback method using Taylor formula at NTCIR-3 patent retrieval task. In: Online proceedings of NTCIR-3

  25. Koch S, Bosch H (2011) From static textual display of patents to graphical interactions. In: Lupu M, Mayer K, Tait J, Trippe A (eds) Current challenges in patent information retrieval, pp 217–235

  26. Koch S, Bosch H, Giereth M, Ertl T (2009) Iterative integration of visual insights during patent search and analysis. In: IEEE symposium on visual analytics science and technology (VAST), pp 203–210

  27. Koch S, Bosch H, Giereth M, Ertl T (2011) Iterative integration of visual insights during scalable patent search and analysis. IEEE Trans Vis Comput Graph 17(5):557–569

  28. Konishi K (2005) Query terms extraction from patent document for invalidity search. In: Online proceedings of NTCIR-5

  29. Krishnan A, Cardenas AF, Springer D (2010) Search for patents using treatment and causal relationships. In: Proceedings of workshop on patent information retrieval, New York, NY, USA, pp 1–10

  30. Li Y, Shawe-Taylor J (2007) Advanced learning algorithms for cross-language patent retrieval and classification. Inf Process Manag 43:1183–1199

  31. Li Y-R, Wang L-H, Hong C-F (2009) Extracting the significant-rare keywords for patent analysis. Expert Syst Appl 36(3):5200–5204

  32. Lopez P, Romary L (2009) PATATRAS: multiple retrieval models and regression models for prior art search. In: CLEF online working notes

  33. Lopez P, Romary L (2010) Experiments with citation mining and key-term extraction for prior art search. In: CLEF notebook papers/labs/workshop

  34. Lupu M, Hanbury A (2013) Patent retrieval. Found Trends Inf Retr 7(1):1–97

  35. Ma J, Matsoukas S (2011) BBN’s systems for the Chinese-English sub-task of NTCIR-9 patent MT evaluation. In: Proceedings of NTCIR-9, pp 579–584

  36. Magdy W, Jones G (2011) A study on query expansion methods for patent retrieval. In: Proceedings of workshop on patent information retrieval, pp 19–24

  37. Marchionini G (1995) Information seeking in electronic environments. Cambridge Univ Press, Cambridge

  38. Moldovan A, Bot RI, Wanka G (2005) Latent semantic indexing for patent documents. Int J Appl Math Comput Sci 15:551

  39. Nanba H, Fujii A, Iwayama M, Hashimoto T (2010) Overview of the patent mining task at the NTCIR-8 workshop. In: Proceedings of the 8th NTCIR workshop meeting, pp 293–302

  40. Nanba H, Kamaya H, Takezawa T, Okumura M, Shinmori A, Tanigawa H (2011) Automatic translation of scholarly terms into patent terms. In: Lupu M, Mayer K, Tait J, Trippe AJ (eds) Current challenges in patent information retrieval, pp 373–388

  41. Nanba H, Mayumi S, Takezawa T (2011) Automatic construction of a bilingual thesaurus using citation analysis. In: Proceedings of workshop on patent information retrieval. ACM, New York, pp 25–30

  42. Nishiyama R, Tsuboi Y, Unno Y, Takeuchi H, Takezawa T (2010) Feature-rich information extraction for the technical trend-map creation. In: Proceedings of the 8th NTCIR workshop meeting, pp 318–324

  43. Offen B (2012) Micro-Visualisierung von Patentdokumenten. Master Thesis, Faculty of Language and Information Science, University of Hildesheim

  44. Osborn M, Strzalkowski T, Marinescu M (1997) Evaluating document retrieval in patent database: a preliminary report. In: Proceedings of conference on information and knowledge management, pp 216–221

  45. Oshio T, Mitsuhashi T, Kakita T (2011) Use of the Japio technical field dictionaries for NTCIR-PatentMT. In: Proceedings of NTCIR-9, pp 614–617

  46. Oshio T, Mitsuhashi T, Kakita T (2013) Use of the Japio technical field dictionaries and commercial rule-based engine for NTCIR-PatentMT. In: Proceedings of NTCIR-10, pp 361–364

  47. Parapatics P, Dittenbach M (2009) Patent claim decomposition for improved information extraction. In: Proceedings of the 2nd international workshop on patent information retrieval (PaIR ’09). ACM, New York, pp 33–36

  48. Perez-Iglesias J, Rodrigo A, Fresno V (2010) Using BM25F and KLD for patent retrieval. In: CLEF notebook papers/labs/workshop

  49. Sheremetyeva S (2003) Natural language analysis of patent claims. In: Proceedings of ACL workshop on patent corpus processing, pp 66–73

  50. Shinmori A, Okumura M, Marukawa Y, Iwayama M (2003) Patent claim processing for readability—structure analysis and term explanation. In: Proceedings of the ACL workshop on patent corpus processing, pp 56–65

  51. Sudoh K, Duh K, Tsukada H, Nagata M, Wu X, Matsuzaki T, Tsujii J (2011) NTT-UT statistical machine translation in NTCIR-9 PatentMT. In: Proceedings of NTCIR-9, pp 585–592

  52. Sudoh K, Suzuki J, Tsukada H, Nagata M, Hoshino S, Miyao Y (2013) NTT-NII statistical machine translation for NTCIR-10 PatentMT. In: Proceedings of NTCIR-10, pp 294–300

  53. Taduri S, Lau GT, Law KH, Yu H, Kesan JP (2011) An ontology-based interactive tool to search documents in the U.S. patent system. In: Proceedings of the annual international digital government research conference: digital government innovation in challenging times, New York, NY, USA, pp 329–330

  54. Takaki T, Fujii A, Ishikawa T (2004) Associative document retrieval by query subtopic analysis and its application to invalidity patent search. In: Proceedings of conference on information and knowledge management, pp 399–405

  55. Takeuchi H, Uramoto N, Takeda K (2005) Experiments on patent retrieval at NTCIR-5 workshop. In: Online proceedings of NTCIR-5

  56. Tseng Y-H, Lin C-J, Lin Y-I (2007) Text mining techniques for patent analysis. Inf Process Manag 43:1216–1247

  57. Verberne S, D’hondt E (2009) Prior art retrieval using the claims section as a bag of words. In: Proceedings of the cross-language evaluation forum conference on multilingual information access evaluation: text retrieval experiments, pp 497–501

  58. Verma M, Varma V (2011) Applying key phrase extraction to aid invalidity search. In: Proceedings of international conference on artificial intelligence and law, pp 249–255

  59. Verma M, Varma V (2011) Exploring keyphrase extraction and IPC classification vectors for prior art search. In: CLEF notebook papers/labs/workshop

  60. Voorhees EM (1998) Variations in relevance judgments and the measurement of retrieval effectiveness. In: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval, pp 697–716

  61. Wang J, Loh HT, Lu WF (2010) NTCIR-8 patent mining task: extracting technology and effect entities in patents and research papers. In: Proceedings of the 8th NTCIR workshop meeting, pp 325–330

  62. Ward M, Keim D, Grinstein GG (2010) Interactive data visualization. Taylor & Francis Ltd, New Delhi

  63. Wikipedia (2013) History of patent law. en.wikipedia.org/wiki/History_of_patent_law. Accessed 24 September 2013

  64. WIPO (2004) WIPO intellectual property handbook: policy, law and use. http://www.wipo.int/export/sites/www/freepublications/en/intproperty/489/wipo_pub_489.pdf. Accessed 24 September 2013

  65. WIPO (2010) WIPO intellectual property handbook: policy, law and use. http://www.wipo.int/export/sites/www/freepublications/en/intproperty/489/wipo_pub_489.pdf. Accessed 24 September 2013

  66. WIPO (2012) World intellectual property indicators. http://www.wipo.int/ipstats/en/wipi/index.html. Accessed 24 September 2013

  67. WIPO (2013) Patents. Frequently asked questions. http://www.wipo.int/patentscope/en/patents_faq.html#patent. Accessed 24 September 2013

  68. Xue X, Croft WB (2009) Automatic query generation for patent search. In: Proceedings of conference on information and knowledge management, pp 2037–2040

  69. Yang S-Y, Soo VW (2008) Comparing the conceptual graphs extracted from patent claims. In: Proceedings of the IEEE international conference on sensor networks, ubiquitous, and trustworthy computing, Washington, DC, USA, pp 394–399

  70. Zhao L, Callan J (2011) How to make manual conjunctive normal form queries work in patents search. In: Online proceedings of text retrieval conference

Author information

Corresponding author

Correspondence to Julia J. Jürgens.

About this article

Cite this article

Jürgens, J.J., Womser-Hacker, C. Limitations of Automatic Patent IR. Datenbank Spektrum 14, 5–17 (2014). https://doi.org/10.1007/s13222-014-0149-y

  • DOI: https://doi.org/10.1007/s13222-014-0149-y
