DOI: 10.1145/3539618.3591842
short-paper
Open access

Building a Graph-Based Patent Search Engine

Published: 18 July 2023

Abstract

Performing prior art searches is an essential step in both patent drafting and invalidation. The task is challenging due to the large number of existing patent documents and the domain knowledge required to analyze the documents. We present a graph-based patent search engine that tries to mimic the work done by a professional patent examiner. Each patent document is converted to a graph that describes the parts of the invention and the relations between the parts. The search engine is powered by a graph neural network that learns to find prior art by using novelty citation data from patent office search reports where citations are compiled by human patent examiners. We show that a graph-based approach is an efficient way to perform searches on technical documents and demonstrate it in the context of patent searching.
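The pipeline described in the abstract (document to graph, graph to vector, vector similarity for ranking) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the names `PatentGraph`, `embed`, and `rank` are hypothetical, hashed word vectors stand in for learned embeddings, and plain neighbor averaging stands in for the trained graph neural network. In particular, the sketch does no learning from examiner citations, which is the core of the system the abstract describes.

```python
import hashlib
import math
from dataclasses import dataclass, field

DIM = 16  # embedding size, chosen arbitrarily for this sketch


@dataclass
class PatentGraph:
    """Parts of an invention (nodes) and relations between them (edges)."""
    nodes: list[str]
    # Each edge is (index of one part, index of a related part).
    edges: list[tuple[int, int]] = field(default_factory=list)


def word_vector(word: str) -> list[float]:
    """Deterministic hashed vector; a stand-in for learned word embeddings."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return [(b - 127.5) / 127.5 for b in digest[:DIM]]


def node_vector(label: str) -> list[float]:
    """Average the word vectors of a node label."""
    vecs = [word_vector(w) for w in label.split()]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def embed(graph: PatentGraph, rounds: int = 2) -> list[float]:
    """Mix each node with the mean of its neighbors, then mean-pool all nodes."""
    states = [node_vector(label) for label in graph.nodes]
    for _ in range(rounds):
        updated = []
        for i, state in enumerate(states):
            neighbors = [states[b] for a, b in graph.edges if a == i]
            neighbors += [states[a] for a, b in graph.edges if b == i]
            if neighbors:
                mean = [sum(col) / len(neighbors) for col in zip(*neighbors)]
                state = [0.5 * s + 0.5 * m for s, m in zip(state, mean)]
            updated.append(state)
        states = updated
    if not states:
        return [0.0] * DIM
    return [sum(col) / len(states) for col in zip(*states)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query: PatentGraph, corpus: dict[str, PatentGraph]) -> list[tuple[str, float]]:
    """Return corpus entries sorted by similarity to the query, best first."""
    q = embed(query)
    scored = [(name, cosine(q, embed(g))) for name, g in corpus.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: a query invention and two candidate documents.
query = PatentGraph(["bicycle", "frame", "wheel", "electric motor"],
                    [(0, 1), (0, 2), (0, 3)])
corpus = {
    "same_structure": PatentGraph(["bicycle", "frame", "wheel", "electric motor"],
                                  [(0, 1), (0, 2), (0, 3)]),
    "unrelated": PatentGraph(["coffee machine", "water tank", "heating element"],
                             [(0, 1), (0, 2)]),
}
ranking = rank(query, corpus)
```

In a trained system the hashed vectors and fixed averaging would be replaced by learned parameters, so that documents an examiner would cite end up close to the query even when their wording differs.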

Supplemental Material

MP4 File
Performing prior art searches is an essential step in both patent drafting and invalidation. The task is challenging and time-consuming due to the large number of existing patent documents and the domain knowledge required to analyze the documents. We present a graph-based patent search engine that tries to mimic the work done by a professional patent examiner, speeding up patent searches and improving the results. Each patent document is converted to a graph that describes the parts of the invention and the relations between the parts. The search engine is powered by a graph neural network that learns to find prior art by using novelty citation data from patent office search reports where citations are compiled by human patent examiners. The graph-based approach makes it easy for humans to understand and control the search results and enables efficient processing of large patent documents.

Cited By

  • (2024) Tracking Biofuel Innovation: A Graph-Based Analysis of Sustainable Aviation Fuel Patents. Energies 17:15 (3683). https://doi.org/10.3390/en17153683. Online publication date: 26-Jul-2024
  • (2024) Relevance Feedback Method For Patent Searching Using Vector Subspaces. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (2860-2864). https://doi.org/10.1145/3626772.3661365. Online publication date: 10-Jul-2024

Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2023
3567 pages
ISBN:9781450394086
DOI:10.1145/3539618
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph neural networks
  2. natural language processing
  3. patent search

Qualifiers

  • Short-paper

Conference

SIGIR '23

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 527
  • Downloads (Last 6 weeks): 57
Reflects downloads up to 05 Mar 2025
