short-paper

Open access

Building a Graph-Based Patent Search Engine

Authors:

Sebastian Björkqvist,

Juho Kallio

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 3300 - 3304

https://doi.org/10.1145/3539618.3591842

Published: 18 July 2023

Abstract

Performing prior art searches is an essential step in both patent drafting and invalidation. The task is challenging due to the large number of existing patent documents and the domain knowledge required to analyze the documents. We present a graph-based patent search engine that tries to mimic the work done by a professional patent examiner. Each patent document is converted to a graph that describes the parts of the invention and the relations between the parts. The search engine is powered by a graph neural network that learns to find prior art by using novelty citation data from patent office search reports where citations are compiled by human patent examiners. We show that a graph-based approach is an efficient way to perform searches on technical documents and demonstrate it in the context of patent searching.
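To make the graph representation concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes a hypothetical claim ("a bicycle comprising a frame, a wheel attached to the frame, and a brake acting on the wheel") and a hypothetical `InventionGraph` class. The relation labels are illustrative only; the abstract does not specify the actual graph schema or how graphs are extracted from patent text.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class InventionGraph:
    """Toy graph of an invention: nodes are parts, edges are labelled relations.

    Hypothetical structure for illustration; not taken from the paper.
    """
    nodes: Set[str] = field(default_factory=set)
    # Each edge is stored as (source part, relation label, target part).
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_relation(self, source: str, relation: str, target: str) -> None:
        """Register both parts as nodes and record the directed relation."""
        self.nodes.update((source, target))
        self.edges.append((source, relation, target))

    def relations_of(self, part: str) -> List[Tuple[str, str]]:
        """Return (relation, target) pairs for edges leaving the given part."""
        return [(rel, tgt) for src, rel, tgt in self.edges if src == part]


def build_example() -> InventionGraph:
    """Build the graph for the assumed bicycle claim."""
    graph = InventionGraph()
    graph.add_relation("bicycle", "comprises", "frame")
    graph.add_relation("bicycle", "comprises", "wheel")
    graph.add_relation("bicycle", "comprises", "brake")
    graph.add_relation("wheel", "attached to", "frame")
    graph.add_relation("brake", "acts on", "wheel")
    return graph


if __name__ == "__main__":
    g = build_example()
    print(sorted(g.nodes))
    print(g.relations_of("bicycle"))
```

A structure like this keeps the parts and their relations explicit, so a reader can inspect which parts of the invention are present in the graph.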

Supplemental Material

MP4 File

Performing prior art searches is an essential step in both patent drafting and invalidation. The task is challenging and time-consuming due to the large number of existing patent documents and the domain knowledge required to analyze the documents. We present a graph-based patent search engine that tries to mimic the work done by a professional patent examiner, speeding up patent searches and improving the results. Each patent document is converted to a graph that describes the parts of the invention and the relations between the parts. The search engine is powered by a graph neural network that learns to find prior art by using novelty citation data from patent office search reports where citations are compiled by human patent examiners. The graph-based approach makes it easy for humans to understand and control the search results and enables efficient processing of large patent documents.
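One way to picture learning from examiner citations is a triplet objective: the embedding of a query patent should lie closer to a patent that an examiner cited against it than to a patent that was not cited. The sketch below is an assumption made for illustration. The text above does not state which loss the authors use, and the vectors stand in for the output of a graph neural network that is not shown. The function names `triplet_loss` and `rank_by_distance` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    query: Sequence[float],
    cited: Sequence[float],
    not_cited: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Hinge-style triplet loss.

    The loss is zero once the cited document is closer to the query than
    the non-cited document by at least `margin`.
    """
    return max(0.0, euclidean(query, cited) - euclidean(query, not_cited) + margin)


def rank_by_distance(
    query: Sequence[float], corpus: Dict[str, Sequence[float]]
) -> List[str]:
    """Return document ids ordered from nearest to farthest from the query."""
    return sorted(corpus, key=lambda doc_id: euclidean(query, corpus[doc_id]))


if __name__ == "__main__":
    query = [0.0, 0.0]
    # Before training (assumed values): the cited document is far from the query.
    print(triplet_loss(query, cited=[3.0, 4.0], not_cited=[0.0, 1.0]))
    # After training (assumed values): the cited document is the nearest one.
    print(triplet_loss(query, cited=[0.0, 1.0], not_cited=[3.0, 4.0]))
    print(rank_by_distance(query, {"unrelated": [3.0, 4.0], "cited": [0.0, 1.0]}))
```

At search time only the ranking step is needed: the query graph is embedded once and compared against precomputed document embeddings.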

Cited By

de Oliveira M, Mosquéra L, Martins P, Serrano A, Bispo G, Vergara G, Saiki G, Neumann C, Gonçalves V. (2024). Tracking Biofuel Innovation: A Graph-Based Analysis of Sustainable Aviation Fuel Patents. Energies 17:15 (3683). Online publication date: 26-Jul-2024.
https://doi.org/10.3390/en17153683
Björkqvist S, Hui Yang G, Wang H, Han S, Hauff C, Zuccon G, Zhang Y. (2024). Relevance Feedback Method For Patent Searching Using Vector Subspaces. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (2860-2864). Online publication date: 10-Jul-2024.
https://dl.acm.org/doi/10.1145/3626772.3661365

Index Terms

Building a Graph-Based Patent Search Engine
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
    2. Retrieval tasks and goals
      1. Expert search

Information & Contributors

Information

Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2023

3567 pages

ISBN: 9781450394086

DOI: 10.1145/3539618

General Chairs: Hsin-Hsi Chen (National Taiwan University), Wei-Jou (Edward) Duh (National Taiwan University), Hen-Hsen Huang (Academia Sinica)

Program Chairs: Makoto P. Kato (Spotify), Josiane Mothe (Université de Toulouse), Barbara Poblete (University of Chile and Amazon Visiting Academic)

Copyright © 2023 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Conference

SIGIR '23

Sponsor:

SIGIR

SIGIR '23: The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 23 - 27, 2023

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
