Research on classification and similarity of patent citation based on deep learning

Lu, Yonghe; Xiong, Xin; Zhang, Weiting; Liu, Jiaxin; Zhao, Ruijie

doi:10.1007/s11192-020-03385-w

Published: 28 February 2020

Scientometrics, volume 123, pages 813–839 (2020)

Yonghe Lu (ORCID: orcid.org/0000-0002-7758-9365)¹, Xin Xiong¹, Weiting Zhang¹, Jiaxin Liu¹ & Ruijie Zhao¹

Abstract

This paper proposes a deep learning model for classifying patent citations and evaluates it on patent datasets in the text analysis and communication area collected from the Google Patents database. Because examiners' citations are technically relevant to the pending patent, the paper further hypothesizes that the output value of the model can be taken as the technology similarity of two patents. The hypothesis is checked in two ways: by machine statistics and by manual spot checks. The experimental results show that the proposed deep learning model performs significantly better than traditional text representation and classification methods, and that it is more robust than the method combining Doc2vec with traditional classification techniques. In addition, a triple verification comparing the proposed method with a traditional similarity method shows that the proposed method calculates the technology similarity of patents more accurately. The results of manual sampling indicate that it is reasonable to use the output value of the proposed model to represent the technology similarity of patents.
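As a rough illustration of the hypothesis in the abstract (reading the output value of a pair classifier as a similarity score), the sketch below uses a deliberately simple stand-in and is not the paper's deep learning model. The function names, the bag-of-words feature, and the `weight` and `bias` values are all hypothetical and chosen only so the example runs with the Python standard library.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens with counts (toy text representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def citation_score(pending, candidate, weight=6.0, bias=-3.0):
    """Toy stand-in for a trained pair classifier.

    Returns a sigmoid output in (0, 1). In the spirit of the abstract's
    hypothesis, this output is read as a similarity score for the pair.
    weight and bias are hypothetical, not learned parameters.
    """
    x = cosine(bag_of_words(pending), bag_of_words(candidate))
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


if __name__ == "__main__":
    pending = "neural network for text classification"
    related = "text classification with a neural network"
    unrelated = "antenna array for signal modulation"
    print(citation_score(pending, related))
    print(citation_score(pending, unrelated))
```

In a real setting the hand-set logistic function would be replaced by a trained classifier; only the idea of reusing its output as a similarity value carries over.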

Acknowledgements

The authors warmly thank the reviewers for their valuable suggestions. This research was supported by the National Natural Science Foundation of China [Grant Number: 71373291] and the Science and Technology Planning Project of Guangdong Province (China) [Grant Number: 2016B030303003].

Author information

Authors and Affiliations

School of Information Management, Sun Yat-sen University, Guangzhou, China
Yonghe Lu, Xin Xiong, Weiting Zhang, Jiaxin Liu & Ruijie Zhao

Corresponding author

Correspondence to Yonghe Lu.

About this article

Cite this article

Lu, Y., Xiong, X., Zhang, W. et al. Research on classification and similarity of patent citation based on deep learning. Scientometrics 123, 813–839 (2020). https://doi.org/10.1007/s11192-020-03385-w

Received: 04 August 2019
Published: 28 February 2020
Issue Date: May 2020
DOI: https://doi.org/10.1007/s11192-020-03385-w
