
Research on classification and similarity of patent citation based on deep learning

Scientometrics

Abstract

This paper proposes a deep learning model for classifying patent citations and evaluates it on patent datasets in the text analysis and communication areas collected from the Google patent database. Because examiners’ citations are technically relevant to the pending patent, the paper also hypothesizes that the output value of the model can serve as the technology similarity of two patents, and verifies this hypothesis through both machine statistics and manual spot checks. The experimental results show that the proposed model performs significantly better than traditional text representation and classification methods and is more robust than the method that combines Doc2vec with traditional classification techniques. In addition, a triple verification comparing the proposed method with traditional similarity methods shows that the proposed method calculates the technology similarity of patents more accurately. The results of manual sampling further indicate that it is reasonable to use the output value of the proposed model to represent the technology similarity of patents.
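The abstract does not spell out how the triple verification is computed. One plausible reading, assumed here purely for illustration, is that each triple holds a pending patent, a patent cited by the examiner, and a patent that was not cited, and that a similarity method counts as correct on a triple when it scores the cited patent higher than the uncited one. The sketch below follows that assumption. The names `Triple`, `jaccard_similarity`, and `triple_accuracy` are hypothetical, the texts are invented toy strings rather than data from the paper, and the Jaccard overlap is only a stand-in for a traditional similarity method; the output value of a trained classifier could be passed in as `score` in the same way.

```python
from typing import Callable, List, Tuple

# Assumed layout: (pending patent, examiner-cited patent, uncited patent).
Triple = Tuple[str, str, str]


def jaccard_similarity(a: str, b: str) -> float:
    """Toy stand-in for a similarity method: Jaccard overlap of token sets."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def triple_accuracy(score: Callable[[str, str], float],
                    triples: List[Triple]) -> float:
    """Fraction of triples where the cited patent scores strictly higher
    than the uncited one (the assumed verification criterion)."""
    if not triples:
        raise ValueError("need at least one triple")
    hits = sum(1 for pending, cited, uncited in triples
               if score(pending, cited) > score(pending, uncited))
    return hits / len(triples)


# Invented toy triples, not taken from the paper's datasets.
triples = [
    ("method for classifying text documents with a neural network",
     "neural network system for text document classification",
     "antenna array for wireless signal transmission"),
    ("wireless signal transmission over multiple antennas",
     "antenna array for wireless signal transmission",
     "method for brewing coffee"),
]

accuracy = triple_accuracy(jaccard_similarity, triples)
print(accuracy)
```

Because `triple_accuracy` takes any function of two texts, two similarity methods can be compared on the same triples by calling it once per method and comparing the returned fractions.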



Acknowledgements

The authors warmly thank the reviewers for their valuable suggestions. This research was supported by the National Natural Science Foundation of China [Grant Number: 71373291] and by the Science and Technology Planning Project of Guangdong Province (China) [Grant Number: 2016B030303003].

Author information

Corresponding author

Correspondence to Yonghe Lu.

About this article

Cite this article

Lu, Y., Xiong, X., Zhang, W. et al. Research on classification and similarity of patent citation based on deep learning. Scientometrics 123, 813–839 (2020). https://doi.org/10.1007/s11192-020-03385-w

  • DOI: https://doi.org/10.1007/s11192-020-03385-w
