
One-to-many comparative summarization for patents

Scientometrics

Abstract

Patents bring commercial value to technology companies, but handling patent applications and infringement cases is costly. A common and expensive part of this work is analyzing the relevant patent literature: patents are long and technically complicated, so reviewing them takes substantial human effort. This paper focuses on automatically identifying the content that a patent shares with its relevant literature, specifically its relevant patents, to help experts review the similarities among them. We formulate this as a one-to-many document comparison problem, in which a comparative summary is generated for a given patent and its relevant patents. We extract essential technical features from the semantic dependency trees of claim sentences and construct a multi-relational graph to model the relevance between features and patents. The key to generating the comparative summary is selecting comparative essential technical features, which we formulate as an optimization problem and solve with a fast greedy algorithm. Experiments on real-world datasets and case studies demonstrate the effectiveness and efficiency of the proposed methods.
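The abstract does not state the objective function, so the following is only a minimal sketch of what greedy feature selection under a budget might look like. It assumes a simple coverage objective (choose features that together appear in as many relevant patents as possible), which stands in for the paper's actual optimization problem. The function name `greedy_select`, the `feature_to_patents` mapping, and the example features and patent IDs are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_select(feature_to_patents: Dict[str, Set[str]], budget: int) -> List[str]:
    """Pick up to `budget` features, each time taking the feature that
    covers the most relevant patents not yet covered.

    Assumption: "coverage" is a stand-in objective, not the paper's.
    Ties are broken alphabetically by feature name.
    """
    covered: Set[str] = set()
    selected: List[str] = []
    candidates = dict(feature_to_patents)
    while candidates and len(selected) < budget:
        # Marginal gain of a feature = number of newly covered patents.
        best = max(sorted(candidates), key=lambda f: len(candidates[f] - covered))
        if not candidates[best] - covered:
            break  # no remaining feature adds coverage
        selected.append(best)
        covered |= candidates.pop(best)
    return selected


# Hypothetical input: each feature maps to the relevant patents that share it.
features = {
    "rotating shaft": {"P1", "P2", "P3"},
    "sealing ring": {"P3", "P4"},
    "drive motor": {"P1", "P2"},
    "cooling fin": {"P5"},
}
summary_features = greedy_select(features, budget=2)
```

A coverage objective of this kind is monotone and submodular, which is the usual reason a greedy pass is a reasonable and fast approximation. Whether the paper's objective has the same property cannot be determined from the abstract alone.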



Funding

Funding was provided by Nanjing University of Posts and Telecommunications (Grant No. NY219084).

Author information

Corresponding author

Correspondence to Zheng Liu.


About this article


Cite this article

Liu, Z., Zhang, J., Qin, T. et al. One-to-many comparative summarization for patents. Scientometrics 127, 1969–1993 (2022). https://doi.org/10.1007/s11192-022-04307-8


  • DOI: https://doi.org/10.1007/s11192-022-04307-8
