
KRL_Match: knowledge graph objects matching for knowledge representation learning

  • Regular Paper
  • Published in Knowledge and Information Systems

Abstract

Existing probability-based knowledge representation learning (KRL) models obtain the embeddings of knowledge graph objects by binary classification at the level of the triple structure, which is coarse in granularity, and the negative sampling used by most current KRL models has low space-time efficiency. To address these problems, this paper proposes KRL_Match, a knowledge representation learning model that matches knowledge graph objects centered on one kind of object (head entity, tail entity, or relation) and performs multi-classification learning to identify the true matching, with dynamic implicit negative sampling. Specifically, we first match the target and source classes of objects of the same kind against each other by matrix multiplication of their embeddings in a knowledge graph batch sample space, which is constructed by random sampling from the universe set of the knowledge graph instance; the knowledge graph object matching sample spaces are generated implicitly at the same time. We then measure the matching degree of each object matching by softmax regression multi-classification in each implicit sample space. Finally, we fit the true probability with the matching degree by optimizing a cross-entropy loss under the local closed-world assumption. Inspired by the attention mechanism, we cast knowledge representation learning as knowledge graph object matching, and we are the first to introduce a dynamic implicit negative sampling method in knowledge representation learning. Experiments show that KRL_Match outperforms the baselines: Hits@10 (filter) improves by 12.2% and 6.1% on the FB15K and FB15K237 benchmarks, respectively, for the entity prediction task, and accuracy improves by 12.6% on the FB13 benchmark for the triple classification task. In addition, a space-time efficiency test indicates that on FB15K (BS = 12000) the negative sampling of KRL_Match takes 7395.59 s less time and half the storage space compared with TransE's.
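The following is a minimal sketch (in PyTorch; not the authors' implementation, and all tensor names are illustrative assumptions) of the in-batch matching and dynamic implicit negative sampling idea described above: target and source object embeddings from one randomly sampled batch are matched against each other by a matrix product, each diagonal pair is treated as the true matching, every off-diagonal pair serves as an implicit negative sample, and a softmax cross-entropy loss fits the matching degrees to the labels implied by the local closed-world assumption.

```python
import torch
import torch.nn.functional as F

def batch_matching_loss(target_emb: torch.Tensor, source_emb: torch.Tensor) -> torch.Tensor:
    """target_emb, source_emb: [BS, dim] embeddings of the two matched object classes."""
    # [BS, BS] matching-degree matrix: every target is matched against every source.
    logits = target_emb @ source_emb.t()
    # Row i's true matching is source i; the remaining BS-1 columns of the row
    # act as dynamically, implicitly generated negative samples.
    labels = torch.arange(target_emb.size(0), device=target_emb.device)
    # Softmax multi-classification with cross-entropy over each implicit sample space.
    return F.cross_entropy(logits, labels)

# Toy usage with random embeddings standing in for a sampled knowledge graph batch:
BS, dim = 8, 64
loss = batch_matching_loss(torch.randn(BS, dim), torch.randn(BS, dim))
```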



Notes

  1. Some pictures are quoted from: https://www.imdb.com/, https://www.1905.com/.

  2. To solve the problem of non-normalized parametric probability density estimation, the basic idea of NCE [43] is to transform the density estimation problem into a binary classification problem that distinguishes samples drawn from the data distribution from samples drawn from a known noise distribution (see the illustration after these notes).

  3. Specifically, such as FB15K, WN18, etc. See Sect. 4 for more information.

  4. For example, the triple (World_War_II, */films, Stalag_17) in Fig. 1a and (Jim_Broadbent, */nominated_for, Another_Year) in Fig. 1b.

  5. A triple can also be regarded as a kind of knowledge graph object, but in this paper "knowledge graph object" generally refers to entities and relations.

  6. It is actually an operation on their embeddings; in this paper we adopt a simple addition and subtraction operation following TransE [18]. The same applies elsewhere.

  7. In this paper, each of the multiple negative samples needs to participate in the determination of the threshold.

  8. https://www.imdb.com/name/nm1561881/?ref_=fn_al_nm_1.

  9. https://www.imdb.com/name/nm0614774/.

  10. https://www.imdb.com/name/nm0578479/.
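As a supplement to note 2, a hedged illustration of the NCE reduction in standard notation (not taken from this paper): with a model density \(p_\theta \), a noise density \(p_n\) and \(k\) noise samples per data sample, the binary classifier's posterior is \( P(D=1 \mid x)=\frac{p_\theta (x)}{p_\theta (x)+k\,p_n(x)} \), and NCE maximizes \( \mathbb {E}_{x\sim p_d}\left[ \log P(D=1 \mid x)\right] + k\,\mathbb {E}_{x\sim p_n}\left[ \log P(D=0 \mid x)\right] \), which yields consistent estimates of the unnormalized model without computing the partition function.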

References

  1. Lin Y, Han X, Xie R, Liu Z, Sun M (2018) Knowledge representation learning: a quantitative review. arXiv preprint arXiv:1812.10901, pp 1–57

  2. Wang Q, Mao Z, Wang B, Guo L (2017) Knowledge graph embedding: a survey of approaches and applications. IEEE Trans Knowl Data Eng (TKDE) 29:2724–2743. https://doi.org/10.1109/TKDE.2017.2754499


  3. Ji S, Pan S, Cambria E, Marttinen P, Yu PS (2021) A survey on knowledge graphs: representation, acquisition, and applications. IEEE Trans Neural Netw Learn Syst (TNNLS). https://doi.org/10.1109/TNNLS.2021.3070843


  4. Chen X, Jia S, Xiang Y (2020) A review: knowledge reasoning over knowledge graph. Expert Syst Appl 141:112948. https://doi.org/10.1016/j.eswa.2019.112948


  5. Nguyen HL, Vu DT, Jung JJ (2020) Knowledge graph fusion for smart systems: a survey. Inf Fus 61:56–70


  6. Cui H, Peng T, Feng L, Bao T, Liu L (2021) Simple question answering over knowledge graph enhanced by question pattern classification. Knowl Inf Syst. https://doi.org/10.1007/s10115-021-01609-w


  7. Bengio Y, Senecal J-S (2008) Adaptive importance sampling to accelerate training of a neural probabilistic language model. IEEE Trans Neural Netw 19(4):713–722


  8. Kotnis B, Nastase V (2018) Analysis of the impact of negative sampling on link prediction in knowledge graphs. arXiv preprint arXiv:1708.06816v2

  9. Rossi A, Barbosa D, Firmani D, Matinata A, Merialdo P (2021) Knowledge graph embedding for link prediction: a comparative analysis. ACM Trans Knowl Discov Data (TKDD) 15(2):1–49


  10. Wang Z, Zhang J, Feng J, Chen Z (2014) Knowledge graph embedding by translating on hyperplanes. In: Proceedings of the AAAI conference on artificial intelligence (AAAI), vol 28, no 1

  11. Cai L, Wang WY (2018) KBGAN: Adversarial learning for knowledge graph embeddings. In: Proceedings of the 2018 conference of the north american chapter of the association for computational linguistics: human language technologies, Volume 1 (Long Papers). Association for Computational Linguistics, New Orleans, Louisiana, pp 1470–1480. https://doi.org/10.18653/v1/N18-1133. https://aclanthology.org/N18-1133

  12. Chaudhari S, Polatkan G, Ramanath R, Mithal V (2019) An attentive survey of attention models. arXiv preprint arXiv:1904.02874

  13. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems (NIPS). Curran Associates, Inc., pp 5998–6008. http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf

  14. Brauwers G, Frasincar F (2021) A general survey on attention mechanisms in deep learning. IEEE Trans Knowl Data Eng 11(15):1–20. https://doi.org/10.1109/TKDE.2021.3126456


  15. Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: 3rd international conference on learning representations (ICLR), pp 1–15

  16. Dong X, Gabrilovich E, Heitz G, Horn W, Lao N, Murphy K, Strohmann T, Sun S, Zhang W (2014) Knowledge vault: a web-scale approach to probabilistic knowledge fusion. In: ACM SIGKDD conference on knowledge discovery and data mining (KDD), KDD2014, Association for Computing Machinery, New York, NY, USA, pp 601–610. https://doi.org/10.1145/2623330.2623623

  17. Wang Z, Zhang J, Feng J, Chen Z (2014) Knowledge graph and text jointly embedding. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, pp 1591–1601. https://doi.org/10.3115/v1/D14-1167

  18. Zhong H, Zhang J, Wang Z, Wan H, Chen Z (2015) Aligning knowledge and text embeddings by entity descriptions. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, Portugal, pp 267–272. https://doi.org/10.18653/v1/D15-1031

  19. He S, Liu K, Ji G, Zhao J (2015) Learning to represent knowledge graphs with Gaussian embedding. In: Proceedings of the 24th ACM international on conference on information and knowledge management, CIKM ’15. Association for Computing Machinery, New York, NY, USA, pp 623–632. https://doi.org/10.1145/2806416.2806502

  20. Xiao H, Huang M, Zhu X (2016) TransG: a generative model for knowledge graph embedding. In: Proceedings of the 54th annual meeting of the association for computational linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Berlin, Germany, pp 2316–2325. https://doi.org/10.18653/v1/P16-1219

  21. Bordes A, Usunier N, Garcia-Durán A, Weston J, Yakhnenko O (2013) Translating embeddings for modeling multi-relational data. In: Advances in neural information processing systems (NIPS), Vol. 26. Curran Associates, Inc., pp 2787–2795. https://proceedings.neurips.cc/paper/2013/file/1cecc7a77928ca8133fa24680a88d2f9-Paper.pdf

  22. Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. In: 1st international conference on learning representations (ICLR)

  23. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems (NIPS), vol 26, pp 3111–3119

  24. Goldberg Y, Levy O (2014) word2vec explained: deriving Mikolov et al.'s negative-sampling word-embedding method. arXiv preprint arXiv:1402.3722

  25. Toutanova K, Chen D, Pantel P, Poon H, Choudhury P, Gamon M (2015) Representing text for joint embedding of text and knowledge bases. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, Portugal, pp 1499–1509. https://doi.org/10.18653/v1/D15-1174

  26. Toutanova K, Chen D (2015) Observed versus latent features for knowledge base and text inference. In: Proceedings of the 3rd workshop on continuous vector space models and their compositionality. Association for Computational Linguistics, Beijing, China, pp 57–66. https://doi.org/10.18653/v1/W15-4007

  27. Trouillon T, Dance C, Gaussier É, Welbl J, Riedel S, Bouchard G (2017) Knowledge graph completion via complex tensor factorization. J Mach Learn Res 18:130:1-130:38


  28. Trouillon T, Welbl J, Riedel S, Gaussier É, Bouchard G (2016) Complex embeddings for simple link prediction. In: International conference on machine learning (ICML), PMLR, pp 2071–2080

  29. Lin Y, Liu Z, Sun M (2016) Knowledge representation learning with entities, attributes and relations. In: International joint conference on artificial intelligence (IJCAI), vol 1, pp 41–52

  30. Fan M, Zhou Q, Zheng T, Grishman R (2017) Distributed representation learning for knowledge graphs with entity descriptions. Pattern Recognit Lett 93:31–37


  31. Dettmers T, Minervini P, Stenetorp P, Riedel S (2018) Convolutional 2D knowledge graph embeddings. In: The association for the advancement of artificial intelligence (AAAI), pp 1811–1818. https://aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17366

  32. Chen X, Chen M, Shi W, Sun Y, Zaniolo C (2019) Embedding uncertain knowledge graphs. In: The association for the advancement of artificial intelligence (AAAI), vol 33, pp 3363–3370. https://doi.org/10.1609/aaai.v33i01.33013363

  33. Guan S, Jin X, Wang Y, Jia Y, Shen H, Li Z, Cheng X (2018) Self-learning and embedding based entity alignment. Knowl Inf Syst 59(2):361–386


  34. Li L, Wang P, Wang Y, Wang S, Yan J, Jiang J, Tang B, Wang C, Liu Y (2020) A method to learn embedding of a probabilistic medical knowledge graph: algorithm development. JMIR Med Inform 8(5):e17645–e17645


  35. Fan M, Zhou Q, Abel A, Zheng T, Grishman R (2016) Probabilistic belief embedding for large-scale knowledge population. Cogn Comput 8:1087–1102


  36. Fan M, Feng Q, Abel A, Zheng T, Grishman R (2015) Probabilistic belief embedding for knowledge base completion. arXiv preprint arXiv:1505.02433

  37. Gong F, Wang M, Wang H, Wang S, Liu M (2021) SMR: Medical knowledge graph embedding for safe medicine recommendation. Big Data Res 23:100174


  38. Yang B, Yih W-t, He X, Gao J, Deng L (2015) Embedding entities and relations for learning and inference in knowledge bases. arXiv preprint arXiv:1412.6575

  39. Ji G, He S, Xu L, Liu K, Zhao J (2015) Knowledge graph embedding via dynamic mapping matrix. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing, vol 1, pp 687–696. https://doi.org/10.3115/v1/p15-1067

  40. Gutmann M, Hyvärinen A (2010) Noise-contrastive estimation: a new estimation principle for unnormalized statistical models. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics (AISTATS), pp 297–304

  41. Gutmann M, Hyvärinen A (2012) Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics. J Mach Learn Res 13:307–361


  42. van den Oord A, Li Y, Vinyals O (2018) Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748

  43. Mnih A, Teh Y (2012) A fast and simple algorithm for training neural probabilistic language models. In: International conference on machine learning (ICML), pp 1–8

  44. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge


  45. Kingma DP, Ba J (2017) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980v9, pp 1–15

  46. Sagi O, Rokach L (2018) Ensemble learning: a survey. Wiley Interdiscip Rev Data Min Knowl Discov 8(4):1–18


  47. Drumond L, Rendle S, Schmidt-Thieme L (2012) Predicting RDF triples in incomplete knowledge bases with tensor factorization. In: Proceedings of the 27th annual ACM symposium on applied computing, SAC ’12. Association for Computing Machinery, New York, NY, USA, pp 326–331. https://doi.org/10.1145/2245276.2245341

  48. Chami I, Wolf A, Juan D-C, Sala F, Ravi S, Ré C (2020) Low-dimensional hyperbolic knowledge graph embeddings. In: Proceedings of the 58th annual meeting of the association for computational linguistics, association for computational linguistics, pp 6901–6914. https://doi.org/10.18653/v1/2020.acl-main.617.

  49. Sun Z, Chen M, Hu W, Wang C, Dai J, Zhang W (2020) Knowledge association with hyperbolic knowledge graph embeddings. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, pp 5704–5716. https://doi.org/10.18653/v1/2020.emnlp-main.460

  50. Sun Z, Deng ZH, Nie JY, Tang J (2019) RotatE: knowledge graph embedding by relational rotation in complex space. In: International conference on learning representations (ICLR)


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant No. 62172061; the National Key R&D Program of China under Grant Nos. 2020YFB1711800 and 2020YFB1707900; the Science and Technology Project of Sichuan Province under Grant Nos. 2021GFW019, 2021YFG0152, 2021YFG0025, 2020YFG0479, 2020YFG0322, 2020GFW035, 2020GFW033; and the R&D Project of Chengdu City under Grant No. 2019-YF05-01790-GX.

Author information


Corresponding author

Correspondence to Bing Guo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

1.1 A. Proof

Proof:

Sufficiency:

\( \because max\left\{ p\left( {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{ overall} \right) \right\} = p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=*} \right) \),

\( \therefore \forall \left\{ {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right\} \in {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{ overall}, p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=*} \right) > p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right) \);

\( \because \forall S_{ batch}, \forall {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{ batch}, {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{ batch} \subset {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{ overall} \),

\( \therefore \forall \left\{ {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right\} \in {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{ batch}, p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=*} \right) > p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right) \).

The sufficiency is thus proved.

Necessity:

\( \because \forall S_{ batch}, max\left\{ p\left( {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{ batch} \right) \right\} = p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=*} \right) \),

\( \therefore \forall \left\{ {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right\} \in {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{batch,0}, p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=*} \right) > p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right) \),

\( \forall \left\{ {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right\} \in {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{batch,1}, p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=*} \right) > p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right) \),

\( \cdots \),

\( \forall \left\{ {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right\} \in {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{batch,BS-1}, p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=*} \right) > p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right) \),

From these inequalities, it can be concluded that:

\( \forall \left\{ {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right\} \in {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{batch,0} \cup {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{batch,1} \cup \cdots \cup {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{batch,BS-1} = {\cup }_{k4=0}^{BS-1}{\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }_{k4}^{ batch} \),

namely,

\( \forall \left\{ {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right\} \in {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{ overall} \), we have \( p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=*} \right) > p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=?} \right) \),

therefore,

\( max\left\{ p\left( {\left\{ o_{k1} \rightarrow o_{k2}^{\prime } \right\} }^{ overall} \right) \right\} = p\left( {o_{k1} \rightarrow o_{k2}^{\prime } \vert }_{k2=*} \right) \).

The necessity is proved.

1.2 B. Comparison of probability knowledge representation learning models

See Table 17.

Table 17 Comparison of KRL models based on probability

Notes:

  1. Element-wise vector product.

  2. Replacing h or t in a triple only.

  3. f(\(\cdot \)): nonlinear function, e.g., ReLU (rectified linear units); \({\bar{h}},{\bar{r}}\): 2D form of h, r; [,]: concatenation; \(*\): convolution operator; \(\omega \): filter; vec(\(\cdot \)): reshaping as a vector; W: linear transformation matrix; t is replaced with T (the embeddings of multiple entities) when the score involves multiple triples (see the sketch after these notes).

  4. \(h_p,t_p\): projection vectors of the head and tail entity.

  5. \(\lambda \): scaling coefficient.

  6. \(\textbf{p,m}\): embedding of a patient (p), a medicine (m) or a disease (d).

  7. M: a set of medicines.

  8. N: size of a knowledge graph instance.
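For concreteness, the following NumPy/SciPy sketch (a hedged illustration, not the paper's or ConvE's released code) shows how the components listed in note 3 compose into a ConvE-style triple score: \(\mathbf{h}\) and \(\mathbf{r}\) are reshaped to 2D, concatenated, convolved with a filter \(\omega \), passed through the nonlinearity f, flattened by vec(\(\cdot \)), projected by W, and scored against the tail embedding. The 2D shape, the single \(3\times 3\) filter and ReLU as f are illustrative assumptions.

```python
import numpy as np
from scipy.signal import correlate2d

def conve_style_score(h, r, t, omega, W):
    d = h.size
    h2d = h.reshape(10, d // 10)                     # \bar{h}: 2D form of h
    r2d = r.reshape(10, d // 10)                     # \bar{r}: 2D form of r
    stacked = np.concatenate([h2d, r2d], axis=0)     # [\bar{h}; \bar{r}]: concatenation
    feat = np.maximum(correlate2d(stacked, omega, mode="valid"), 0.0)  # f(... * omega), f = ReLU
    hidden = np.maximum(feat.reshape(-1) @ W, 0.0)   # f(vec(.) W)
    return hidden @ t                                # score against the tail embedding

d = 200
rng = np.random.default_rng(0)
h, r, t = rng.normal(size=(3, d))                    # toy embeddings
omega = rng.normal(size=(3, 3))                      # one 3x3 convolution filter
W = rng.normal(size=((20 - 3 + 1) * (d // 10 - 3 + 1), d))
print(conve_style_score(h, r, t, omega, W))
```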

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Suo, X., Guo, B., Shen, Y. et al. KRL_Match: knowledge graph objects matching for knowledge representation learning. Knowl Inf Syst 65, 641–681 (2023). https://doi.org/10.1007/s10115-022-01764-8

