Abstract
Knowledge fusion is an important part of constructing a knowledge graph. As major knowledge bases have matured in recent years, integrating multi-source knowledge bases has become both the focus and the main difficulty of knowledge fusion research. Because knowledge bases differ widely in structure, the efficiency and accuracy of fusion remain low. To address this problem, this paper proposes MEFE (Multi-fEature Knowledge Fusion and Evaluation Method), which is based on BERT (Bidirectional Encoder Representations from Transformers). MEFE jointly considers the attributes, descriptions and categories of entities when fusing multi-source knowledge bases. First, MEFE uses entity category tags to build a category dictionary. Then, it vectorizes the category tags based on the dictionary and clusters the entities according to those tags. Finally, it uses BERT to calculate the entity similarity for entity pairs in the same cluster. From the fusion result, we calculate the entity redundancy rate and the information loss rate of the knowledge base in order to evaluate its quality. Experiments show that clustering effectively improves the efficiency of knowledge fusion and that the use of BERT improves its accuracy.
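The three steps named in the abstract (category dictionary, category-based grouping, pairwise similarity within a group) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the clustering algorithm or how BERT scores a pair, so the sketch makes two loud assumptions: entities are grouped by identical category vectors, and a token-overlap (Jaccard) score stands in for the BERT-based similarity. The entity records, field names, function names and the 0.5 threshold are all hypothetical.

```python
from itertools import combinations


def build_category_dictionary(entities):
    """Map every category tag seen in the data to a fixed index."""
    tags = sorted({tag for e in entities for tag in e["categories"]})
    return {tag: i for i, tag in enumerate(tags)}


def vectorize(entity, dictionary):
    """Multi-hot vector of the entity's category tags over the dictionary."""
    vec = [0] * len(dictionary)
    for tag in entity["categories"]:
        vec[dictionary[tag]] = 1
    return tuple(vec)


def group_by_category(entities, dictionary):
    """Assumed stand-in for clustering: identical category vectors share a group."""
    groups = {}
    for e in entities:
        groups.setdefault(vectorize(e, dictionary), []).append(e)
    return list(groups.values())


def text_similarity(a, b):
    """Assumed placeholder for the BERT score: Jaccard overlap of description tokens."""
    ta = set(a["description"].lower().split())
    tb = set(b["description"].lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def candidate_matches(entities, threshold=0.5):
    """Compare only cross-source pairs that fall in the same category group."""
    dictionary = build_category_dictionary(entities)
    matches = []
    for group in group_by_category(entities, dictionary):
        for a, b in combinations(group, 2):
            if a["source"] != b["source"] and text_similarity(a, b) >= threshold:
                matches.append((a["id"], b["id"]))
    return matches


# Hypothetical toy records from two knowledge bases.
entities = [
    {"id": "kb1:Paris", "source": "kb1", "categories": ["city", "capital"],
     "description": "capital city of France"},
    {"id": "kb2:Paris", "source": "kb2", "categories": ["capital", "city"],
     "description": "the capital city of France"},
    {"id": "kb2:Paris_Hilton", "source": "kb2", "categories": ["person"],
     "description": "American media personality"},
]

print(candidate_matches(entities))
```

Grouping first means the similarity function runs only on pairs inside a group rather than on every pair across the knowledge bases, which is the efficiency argument the abstract makes for clustering. Swapping `text_similarity` for a real sentence-embedding comparison would not change the rest of the sketch.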
Acknowledgments
This work was supported by the National Key R&D Program of China (2017YFB1401300, 2017YFB1401302), Outstanding Youth of Jiangsu Natural Science Foundation (BK20170100), Key R&D Program of Jiangsu (BE2017166), Natural Science Foundation of the Jiangsu Higher Education Institutions of China (No. 19KJB520046), Natural Science Foundation of Jiangsu Province (No. BK20170900), Innovative and Entrepreneurial talents projects of Jiangsu Province, Jiangsu Planned Projects for Postdoctoral Research Funds (No. 2019K024), Six talent peak projects in Jiangsu Province, the Ministry of Education Foundation of Humanities and Social Sciences (No. 20YJC880104), NUPT DingShan Scholar Project and NUPTSF (NY219132) and CCF-Tencent Open Fund WeBank Special Funding (No. CCF-WebankRAGR20190104).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ji, Y. et al. (2020). MEFE: A Multi-fEature Knowledge Fusion and Evaluation Method Based on BERT. In: Qiu, M. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2020. Lecture Notes in Computer Science, vol 12453. Springer, Cham. https://doi.org/10.1007/978-3-030-60239-0_30
DOI: https://doi.org/10.1007/978-3-030-60239-0_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60238-3
Online ISBN: 978-3-030-60239-0
eBook Packages: Mathematics and Statistics (R0)