Learning semantic and relationship joint embedding for author name disambiguation

Xiong, Bo; Bao, Peng; Wu, Yilin

doi:10.1007/s00521-020-05088-y

Original Article
Published: 20 June 2020

Volume 33, pages 1987–1998, (2021)
Neural Computing and Applications

Abstract

Author name disambiguation is an important research topic in the academic information retrieval community. Existing methods rely either on feature engineering over rich attribute information or on relationship information to measure document similarity, but seldom consider the complementarity and correlation between the two. Feature engineering on attributes, especially rich text, can capture global semantic concepts, while relationship information can encode local structural proximity in multiple academic networks. To bridge the gap between semantic and relationship information in author name disambiguation, this paper presents a joint representation learning approach that encodes both into a common low-dimensional space. Specifically, the proposed method consists of four modules: (1) a semantic embedding module; (2) a relationship embedding module; (3) a semantic and relationship joint embedding module; and (4) a clustering module. Experimental results demonstrate that the proposed approach consistently outperforms state-of-the-art methods on three benchmarks.
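The four-module layout described in the abstract can be illustrated with a toy pipeline. The sketch below is not the authors' model: the abstract does not specify how any module is built, so every function here is a hypothetical stand-in (bag-of-words counts for the semantic embedding, co-author indicator vectors for the relationship embedding, weighted concatenation for the joint embedding, and threshold-based single-link clustering). The names `semantic_embedding`, `relationship_embedding`, `joint_embedding`, `cluster`, and the parameters `alpha` and `threshold` are assumptions made for illustration only.

```python
import math
from collections import Counter


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def semantic_embedding(titles):
    """Stand-in for module (1): unit-length bag-of-words vector per title."""
    vocab = sorted({w for t in titles for w in t.lower().split()})
    vectors = []
    for title in titles:
        counts = Counter(title.lower().split())
        vectors.append(l2_normalize([float(counts[w]) for w in vocab]))
    return vectors


def relationship_embedding(coauthor_sets):
    """Stand-in for module (2): unit-length co-author indicator vector per paper."""
    people = sorted({p for authors in coauthor_sets for p in authors})
    return [
        l2_normalize([1.0 if p in authors else 0.0 for p in people])
        for authors in coauthor_sets
    ]


def joint_embedding(sem, rel, alpha=0.5):
    """Stand-in for module (3): weighted concatenation into one shared space."""
    return [
        l2_normalize([alpha * x for x in s] + [(1 - alpha) * x for x in r])
        for s, r in zip(sem, rel)
    ]


def cluster(vectors, threshold=0.3):
    """Stand-in for module (4): merge papers whose cosine similarity >= threshold."""
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            # Vectors are unit length, so the dot product is the cosine similarity.
            if sum(a * b for a, b in zip(vectors[i], vectors[j])) >= threshold:
                parent[find(j)] = find(i)
    return [find(i) for i in range(len(vectors))]


# Four made-up papers that share one ambiguous author name (hypothetical data).
titles = [
    "graph embedding for networks",
    "network embedding with graph walks",
    "protein folding dynamics",
    "dynamics of protein structure",
]
coauthors = [{"A. Smith", "B. Lee"}, {"B. Lee"}, {"C. Kumar"}, {"C. Kumar", "D. Ito"}]

joint = joint_embedding(semantic_embedding(titles), relationship_embedding(coauthors))
labels = cluster(joint)
print(labels)  # [0, 0, 2, 2]: papers 0-1 and papers 2-3 go to different authors
```

In this toy setting, papers that share neither vocabulary nor co-authors have zero similarity, so they fall into separate clusters; a learned joint embedding such as the one the paper proposes would replace the hand-built vectors.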

Acknowledgements

This work was supported by the National Natural Science Foundation of China under grant number 61702031 and the Fundamental Research Funds for the Central Universities under grant number 2020JBM077. The authors would like to thank the editor and reviewers for their valuable comments and constructive suggestions to improve the paper.

Author information

Authors and Affiliations

School of Software Engineering, Beijing Jiaotong University, Beijing, 100044, China
Bo Xiong, Peng Bao & Yilin Wu

Corresponding author

Correspondence to Peng Bao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xiong, B., Bao, P. & Wu, Y. Learning semantic and relationship joint embedding for author name disambiguation. Neural Comput & Applic 33, 1987–1998 (2021). https://doi.org/10.1007/s00521-020-05088-y

Received: 11 December 2019
Accepted: 04 June 2020
Published: 20 June 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s00521-020-05088-y

Keywords
