Abstract
In recent years, the interest of the IR community in patent retrieval has risen. Various experiments have been undertaken and the problems of the field have been explored in depth. Results show that the paramount steps in the patent retrieval process are difficult to automate. This paper gives an overview of the methods that have been tested in the domain, including their results, and promotes a complementary, more interactive approach to patent retrieval. Finally, promising areas for future research are described.
Notes
Prior art is, in general, all the knowledge that existed prior to the relevant filing or priority date of a patent application, whether it existed by way of written or oral disclosure [64].
There are no exact numbers for the quantity of patents available. Estimates range between 50 and 90 million documents; although imprecise, they give an impression of the magnitude that has to be dealt with.
Lopez & Romary achieved a higher MAP score in CLEF 2009 and CLEF 2010 by using the citations that were the basis for the relevance assessment. Because exploiting these citations necessarily leads to better results, the approach cannot be considered in a fair comparison.
References
Adams S (2009) New methodologies for patent searching; what do we need? In: Proceedings of the global symposium of intellectual property authorities. Available at http://www.wipo.int/export/sites/www/meetings/en/2009/sym_ip_auth/pdf/stephen_adams.pdf
Agatonovic M, Aswani N, Bontcheva K, Cunningham H, Heitz T, Li Y, Roberts I, Tablan V (2008) Large-scale, parallel automatic patent annotation. In: Proceedings of workshop on patent information retrieval, pp 1–8
Azzopardi L, Vanderbauwhede W, Joho H (2010) Search system requirements of patent analysts. In: SIGIR ’10, pp 775–776
Becks D (2013) Entwicklung eines Framework für die begriffliche Optimierung von Patentanfragen. Doctoral Thesis, Faculty of Language and Information Sciences, University of Hildesheim
Becks D, Mandl T, Womser-Hacker C (2010) Phrases or terms? The impact of different query types. In: CLEF notebook papers/labs/workshop
Bouayad-Agha N, Casamayor G, Ferraro G, Mille S, Vidal V, Wanner L (2009) Improving the comprehension of legal documentation: the case of patent claims. In: Proceedings of international conference on artificial intelligence and law, pp 78–87
Ceausu A, Tinsley J, Way A, Zhang J, Sheridan P (2011) Experiments on domain adaptation for patent machine translation in the PLuTO project. In: Proceedings of conference of the European association for machine translation, pp 21–28
D’hondt E, Verberne S, Alink W, Cornacchia R (2011) Combining document representations for prior-art retrieval. In: CLEF notebook papers/labs/workshop
Eisinger D (2013) Automated patent categorization and guided patent search using IPC as inspired by MeSH and PubMed. Doctoral Thesis, Faculty of Informatics, TU Dresden
EPO (2012) Patent searching for beginners—EPO patent information beginners seminar (EPIBS). http://www.epo.org/learning-events/events/training/patent-information-training/PI01-2013.html. Accessed 24 September 2013
Fujii A (2007) Enhancing patent retrieval by citation analysis. In: Proceedings of international ACM SIGIR conference on research and development in information retrieval, pp 793–794
Fujita S (2005) Revisiting document length hypotheses: a comparative study of Japanese newspaper and patent retrieval. ACM Trans Asian Lang Inf Process 4:207–235
Ganguly D, Leveling J, Jones G (2011) United we fall, divided we stand: a study of query segmentation and PRF for patent prior art search. In: Proceedings of the 4th workshop on patent information retrieval. ACM, New York, pp 13–18
Giachanou A, Salampasis M, Satratzemi M, Samaras N (2013) Report on the CLEF-IP 2013 experiments: multilayer collection selection on topically organized patents. In: CLEF online working notes
Gobeill J, Ruch P (2012) BiTeM site report for the claims to passage task in CLEF-IP 2012. In: CLEF online working notes/labs/workshop
Goto I, Lu B, Chow KP, Sumita E, Tsou BK (2011) Overview of the patent machine translation task at the NTCIR-9 workshop. In: Proceedings of NTCIR-9 workshop meeting, December 6–9, Tokyo, Japan, pp 559–578
Hackl-Sommer RA (2010) Transparentes Ranking und Relevanz-Feedback im Patentretrieval. Fachinformationszentrum Karlsruhe
Harris CG, Arens R, Srinivasan P (2010) Comparison of IPC and USPC classification systems in patent prior art searches. In: Proceedings of workshop on patent information retrieval. ACM, New York, pp 27–32
Huang Z, Devlin J, Matsoukas S (2013) BBN’s systems for the Chinese-English sub-task of the NTCIR-10 PatentMT evaluation. In: Proceedings of NTCIR-10, pp 287–293
Indukuri KV, Ambekar AA, Sureka A (2007) Similarity analysis of patent claims using natural language processing techniques. In: International conference on computational intelligence and multimedia applications, vol 4, pp 169–175
Iwayama M, Fujii A, Kando N, Marukawa Y (2003) An empirical study on retrieval models for different document genres: patents and newspaper articles. In: Proceedings of international ACM SIGIR conference on research and development in information retrieval, pp 251–258
Jochim C, Lioma C, Schütze H (2011) Expanding queries with term and phrase translations in patent retrieval. In: Multidisciplinary information retrieval, pp 16–29
Kim Y, Seo J, Croft WB (2011) Automatic Boolean query suggestion for professional search. In: Proceedings of international ACM SIGIR conference on research and development in information retrieval, pp 825–834
Kishida K (2003) Experiments on pseudo relevance feedback method using Taylor formula at NTCIR-3 patent retrieval task. In: Online proceedings of NTCIR-3
Koch S, Bosch H (2011) From static textual display of patents to graphical interactions. In: Lupu M, Mayer K, Tait J, Trippe A (eds) Current challenges in patent information retrieval, pp 217–235
Koch S, Bosch H, Giereth M, Ertl T (2009) Iterative integration of visual insights during patent search and analysis. In: IEEE symposium on visual analytics science and technology (VAST), pp 203–210
Koch S, Bosch H, Giereth M, Ertl T (2011) Iterative integration of visual insights during scalable patent search and analysis. IEEE Trans Vis Comput Graph 17(5):557–569
Konishi K (2005) Query terms extraction from patent document for invalidity search. In: Online proceedings of NTCIR-5
Krishnan A, Cardenas AF, Springer D (2010) Search for patents using treatment and causal relationships. In: Proceedings of workshop on patent information retrieval, New York, NY, USA, pp 1–10
Li Y, Shawe-Taylor J (2007) Advanced learning algorithms for cross-language patent retrieval and classification. Inf Process Manag 43:1183–1199
Li Y-R, Wang L-H, Hong C-F (2009) Extracting the significant-rare keywords for patent analysis. Expert Syst Appl 36(3):5200–5204
Lopez P, Romary L (2009) PATATRAS: multiple retrieval models and regression models for prior art search. In: CLEF online working notes
Lopez P, Romary L (2010) Experiments with citation mining and key-term extraction for prior art search. In: CLEF notebook papers/labs/workshop
Lupu M, Hanbury A (2013) Patent retrieval. Found Trends Inf Retr 7(1):1–97
Ma J, Matsoukas S (2011) BBN’s systems for the Chinese-English sub-task of NTCIR-9 patent MT evaluation. In: Proceedings of NTCIR-9, pp 579–584
Magdy W, Jones G (2011) A study on query expansion methods for patent retrieval. In: Proceedings of workshop on patent information retrieval, pp 19–24
Marchionini G (1995) Information seeking in electronic environments. Cambridge Univ Press, Cambridge
Moldovan A, Bot RI, Wanka G (2005) Latent semantic indexing for patent documents. Int J Appl Math Comput Sci 15:551
Nanba H, Fujii A, Iwayama M, Hashimoto T (2010) Overview of the patent mining task at the NTCIR-8 workshop. In: Proceedings of the 8th NTCIR workshop meeting, pp 293–302
Nanba H, Kamaya H, Takezawa T, Okumura M, Shinmori A, Tanigawa H (2011) Automatic translation of scholarly terms into patent terms. In: Lupu M, Mayer K, Tait J, Trippe AJ (eds) Current challenges in patent information retrieval, pp 373–388
Nanba H, Mayumi S, Takezawa T (2011) Automatic construction of a bilingual thesaurus using citation analysis. In: Proceedings of workshop on patent information retrieval. ACM, New York, pp 25–30
Nishiyama R, Tsuboi Y, Unno Y, Takeuchi H, Takezawa T (2010) Feature-rich information extraction for the technical trend-map creation. In: Proceedings of the 8th NTCIR workshop meeting, pp 318–324
Offen B (2012) Micro-Visualisierung von Patentdokumenten. Master Thesis, Faculty of Language and Information Science, University of Hildesheim
Osborn M, Strzalkowski T, Marinescu M (1997) Evaluating document retrieval in patent database: a preliminary report. In: Proceedings of conference on information and knowledge management, pp 216–221
Oshio T, Mitsuhashi T, Kakita T (2011) Use of the Japio technical field dictionaries for NTCIR-PatentMT. In: Proceedings of NTCIR-9, pp 614–617
Oshio T, Mitsuhashi T, Kakita T (2013) Use of the Japio technical field dictionaries and commercial rule-based engine for NTCIR-PatentMT. In: Proceedings of NTCIR-10, pp 361–364
Parapatics P, Dittenbach M (2009) Patent claim decomposition for improved information extraction. In: Proceedings of the 2nd international workshop on patent information retrieval (PaIR ’09). ACM, New York, pp 33–36
Perez-Iglesias J, Rodrigo A, Fresno V (2010) Using BM25F and KLD for patent retrieval. In: CLEF notebook papers/labs/workshop
Sheremetyeva S (2003) Natural language analysis of patent claims. In: Proceedings of ACL workshop on patent corpus processing, pp 66–73
Shinmori A, Okumura M, Marukawa Y, Iwayama M (2003) Patent claim processing for readability—structure analysis and term explanation. In: Proceedings of the ACL workshop on patent corpus processing, pp 56–65
Sudoh K, Duh K, Tsukada H, Nagata M, Wu X, Matsuzaki T, Tsujii J (2011) NTT-UT statistical machine translation in NTCIR-9 PatentMT. In: Proceedings of NTCIR-9, pp 585–592
Sudoh K, Suzuki J, Tsukada H, Nagata M, Hoshino S, Miyao Y (2013) NTT-NII statistical machine translation for NTCIR-10 PatentMT. In: Proceedings of NTCIR-10, pp 294–300
Taduri S, Lau GT, Law KH, Yu H, Kesan JP (2011) An ontology-based interactive tool to search documents in the U.S. patent system. In: Proceedings of the annual international digital government research conference: digital government innovation in challenging times, New York, NY, USA, pp 329–330
Takaki T, Fujii A, Ishikawa T (2004) Associative document retrieval by query subtopic analysis and its application to invalidity patent search. In: Proceedings of conference on information and knowledge management, pp 399–405
Takeuchi H, Uramoto N, Takeda K (2005) Experiments on patent retrieval at NTCIR-5 workshop. In: Online proceedings of NTCIR-5
Tseng Y-H, Lin C-J, Lin Y-I (2007) Text mining techniques for patent analysis. Inf Process Manag 43:1216–1247
Verberne S, D’hondt E (2009) Prior art retrieval using the claims section as a bag of words. In: Proceedings of the cross-language evaluation forum conference on multilingual information access evaluation: text retrieval experiments, pp 497–501
Verma M, Varma V (2011) Applying key phrase extraction to aid invalidity search. In: Proceedings of international conference on artificial intelligence and law, pp 249–255
Verma M, Varma V (2011) Exploring keyphrase extraction and IPC classification vectors for prior art search. In: CLEF notebook papers/labs/workshop
Voorhees EM (1998) Variations in relevance judgments and the measurement of retrieval effectiveness. In: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval, pp 697–716
Wang J, Loh HT, Lu WF (2010) NTCIR-8 patent mining task: extracting technology and effect entities in patents and research papers. In: Proceedings of the 8th NTCIR workshop meeting, pp 325–330
Ward M, Keim D, Grinstein GG (2010) Interactive data visualization. Taylor & Francis Ltd, New Delhi
Wikipedia (2013) History of patent law. en.wikipedia.org/wiki/History_of_patent_law. Accessed 24 September 2013
WIPO (2004) WIPO intellectual property handbook: policy, law and use. http://www.wipo.int/export/sites/www/freepublications/en/intproperty/489/wipo_pub_489.pdf. Accessed 24 September 2013
WIPO (2010) WIPO intellectual property handbook: policy, law and use. http://www.wipo.int/export/sites/www/freepublications/en/intproperty/489/wipo_pub_489.pdf. Accessed 24 September 2013
WIPO (2012) World intellectual property indicators. http://www.wipo.int/ipstats/en/wipi/index.html. Accessed 24 September 2013
WIPO (2013) Patents. Frequently asked questions. http://www.wipo.int/patentscope/en/patents_faq.html#patent. Accessed 24 September 2013
Xue X, Croft WB (2009) Automatic query generation for patent search. In: Proceedings of conference on information and knowledge management, pp 2037–2040
Yang S-Y, Soo VW (2008) Comparing the conceptual graphs extracted from patent claims. In: Proceedings of the IEEE international conference on sensor networks, ubiquitous, and trustworthy computing, Washington, DC, USA, pp 394–399
Zhao L, Callan J (2011) How to make manual conjunctive normal form queries work in patents search. In: Online proceedings of text retrieval conference
Cite this article
Jürgens, J.J., Womser-Hacker, C. Limitations of Automatic Patent IR. Datenbank Spektrum 14, 5–17 (2014). https://doi.org/10.1007/s13222-014-0149-y
DOI: https://doi.org/10.1007/s13222-014-0149-y