Patent2Vec: Multi-view representation learning on patent-graphs for patent classification

Fang, Lintao; Zhang, Le; Wu, Han; Xu, Tong; Zhou, Ding; Chen, Enhong

doi:10.1007/s11280-021-00885-4

Published: 16 June 2021

Volume 24, pages 1791–1812, (2021)

World Wide Web

Abstract

Patent classification has long been treated as a crucial task to support related services. Although considerable effort has gone into automatic patent classification, prior work mainly focuses on mining textual information such as titles and abstracts. Few approaches pay attention to metadata, e.g., the inventors and the assignee company, and the potential correlations captured by metadata-based graphs have been largely ignored. To that end, in this paper we develop a new paradigm for the patent classification task from the perspective of multi-view patent graph analysis, and propose a novel framework called Patent2Vec to learn low-dimensional representations of patents for patent classification. Specifically, we first apply graph representation learning to the individual graphs, so that view-specific representations are learned by capturing the network structure and side information. Then, we propose a view enhancement module to enrich single-view representations by exploiting cross-view correlation knowledge. Afterward, we deploy an attention-based multi-view fusion method to obtain refined representations for each patent, and further design a view alignment module to constrain the final fused representation in a relational embedding space that preserves latent relational information. Empirical results demonstrate that our model not only improves classification accuracy but also improves the interpretability of patent classification as reflected in the multi-source data.
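
The attention-based multi-view fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, so everything here is an assumption made for illustration: the view names (`inventor`, `assignee`, `text`), the fixed `query` vector standing in for a learned attention parameter, and the score `query · tanh(h_v)`. It is not the authors' implementation; it only shows how per-view embeddings of one patent could be combined into a single vector with softmax weights that can be inspected for interpretability.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(view_embeddings, query):
    """Fuse per-view embeddings of one patent into a single vector.

    view_embeddings: dict mapping a (hypothetical) view name to its embedding.
    query: attention vector; learned in a real model, fixed here by assumption.
    Returns the fused embedding and the attention weight of each view.
    """
    names = list(view_embeddings)
    # Assumed scoring function: dot product of query with tanh(embedding).
    scores = [
        sum(q * math.tanh(x) for q, x in zip(query, view_embeddings[name]))
        for name in names
    ]
    weights = softmax(scores)
    dim = len(query)
    # Fused vector is the attention-weighted sum of the view embeddings.
    fused = [
        sum(w * view_embeddings[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy view-specific embeddings for one patent (illustrative values only).
views = {
    "inventor": [0.2, -0.1, 0.4],
    "assignee": [0.5, 0.3, -0.2],
    "text": [0.1, 0.9, 0.0],
}
query = [0.3, 0.6, -0.1]

fused, weights = attention_fuse(views, query)
print(weights)
print(fused)
```

Because the weights sum to one, each of them can be read as the relative contribution of a view to the fused representation, which is one way the interpretability claim in the abstract could be examined.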

Acknowledgments

This research was partially supported by grants from the National Key Research and Development Program of China (Grant No. 2018YFB1402600) and the National Natural Science Foundation of China (Grant No. 62072423).

Author information

Authors and Affiliations

Anhui Province Key Lab of Big Data Analysis and Application, University of Science and Technology of China, Hefei, China
Lintao Fang, Le Zhang, Han Wu, Tong Xu, Ding Zhou & Enhong Chen

Corresponding author

Correspondence to Tong Xu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Explainability in the Web

Guest Editors: Guandong Xu, Hongzhi Yin, Irwin King, and Lin Li

About this article

Cite this article

Fang, L., Zhang, L., Wu, H. et al. Patent2Vec: Multi-view representation learning on patent-graphs for patent classification. World Wide Web 24, 1791–1812 (2021). https://doi.org/10.1007/s11280-021-00885-4

Received: 30 October 2020
Revised: 20 February 2021
Accepted: 21 April 2021
Published: 16 June 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s11280-021-00885-4
