HOPLoP: multi-hop link prediction over knowledge graph embeddings

World Wide Web

Abstract

Large-scale Knowledge Graphs (KGs) support applications such as Web search and personal assistants, and provide training data for numerous Natural Language Processing tasks. Nevertheless, building KGs with high accuracy and domain coverage remains difficult, and neither manual nor automatic efforts are up to par. Link Prediction (LP) is one of many tasks aimed at addressing this problem. Its goal is to find missing links between entities in the KG by exploiting regularities in the graph structure. Recent years have seen two approaches emerge: using KG embeddings, and modelling complex relations by exploiting correlations between individual links and longer paths connecting the same pair of entities. For the latter, state-of-the-art methods traverse the KG itself and are hampered by both the incompleteness and the skewed degree distributions found in most KGs; as a result, some entities are over-represented in the training set, which leads to poor generalization. We present HOPLoP, an efficient and effective multi-hop LP meta-method that performs the equivalent of path traversals in the KG embedding space instead of on the KG itself, marrying both ideas. We show how to train and tune our method with different underlying KG embeddings, and report on experiments on many benchmarks, showing both that HOPLoP improves each LP method on its own and that it consistently outperforms the previous state of the art by a good margin. Finally, we describe a way to interpret paths generated by HOPLoP when used with TransE.
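To make the idea of traversing the embedding space more concrete, the following is a minimal sketch and not HOPLoP's actual architecture, which the abstract does not specify. It relies only on the well-known TransE assumption that a true triple (h, r, t) satisfies h + r ≈ t. The toy entities, relations, vectors, and all function names (`transe_score`, `traverse`, `nearest`, `interpret`) are hypothetical and chosen for illustration; reading each hop as its nearest relation vector is likewise an assumption about how paths might be interpreted under TransE.

```python
import math

# Toy 2-d TransE-style embeddings. All names and values are invented for
# illustration; they do not come from the paper or its benchmarks.
ENTITIES = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (1.0, 1.0),
}
RELATIONS = {
    "r1": (1.0, 0.0),  # translates a to b
    "r2": (0.0, 1.0),  # translates b to c
}


def add(u, v):
    """Element-wise sum of two vectors."""
    return tuple(x + y for x, y in zip(u, v))


def transe_score(head, relation, tail):
    """TransE plausibility: negative distance between head + relation and tail."""
    translated = add(ENTITIES[head], RELATIONS[relation])
    return -math.dist(translated, ENTITIES[tail])


def traverse(head, hops):
    """Apply a sequence of hop vectors to the head, staying in embedding space."""
    position = ENTITIES[head]
    for hop in hops:
        position = add(position, hop)
    return position


def nearest(vector, table):
    """Return the name in `table` whose embedding is closest to `vector`."""
    return min(table, key=lambda name: math.dist(vector, table[name]))


def interpret(hops):
    """Read each hop as the relation whose embedding it most resembles."""
    return [nearest(hop, RELATIONS) for hop in hops]


# Hop vectors that a learned model might emit (hypothetical values). They do
# not need to coincide with any edge that is actually stored in the KG.
hops = [(0.9, 0.1), (0.1, 1.1)]
end_point = traverse("a", hops)
predicted_tail = nearest(end_point, ENTITIES)
path_reading = interpret(hops)
```

Because each hop is a free vector rather than a stored edge, a traversal of this kind is not blocked by a missing link in the graph, which is the motivation the abstract gives for working in the embedding space.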

Notes

  1. https://en.wikipedia.org/wiki/Terminator_(franchise)#The_Terminator_(1984)

  2. In the next section, we see that HOPLoP learns to not hop; this is non-trivial because we do not explicitly provide HOPLoP with feedback regarding when not to hop, nor do we set up any constraints in the traversal process.

  3. This would imply that the single-hop relation \(r^{\prime }\) present in the graph is semantically similar to the task r.

Funding

This work was funded in part by grants from the Natural Science Research Council of Canada and an Alberta Innovates Graduate Student Scholarship.

Author information

Contributions

V. Ranganathan is the main author and contributed ideas, code, and experimentation. D. Barbosa supervised the research and contributed ideas.

Corresponding author

Correspondence to Denilson Barbosa.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Availability of data and other material

https://github.com/U-Alberta/HOPLoP.

Code availability

https://github.com/U-Alberta/HOPLoP.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Large Scale Graph Data Analytics Guest Editors: Xuemin Lin, Lu Qin, Wenjie Zhang, and Ying Zhang

About this article

Cite this article

Ranganathan, V., Barbosa, D. HOPLoP: multi-hop link prediction over knowledge graph embeddings. World Wide Web 25, 1037–1065 (2022). https://doi.org/10.1007/s11280-021-00972-6

  • DOI: https://doi.org/10.1007/s11280-021-00972-6
