Using Embeddings to Predict Changes in Large Semantic Graphs

  • Conference paper
Information Management and Big Data (SIMBig 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1070)

Abstract

Understanding and predicting how large knowledge graphs change over time is as difficult as it is useful. An important subtask in addressing this artificial intelligence challenge is to characterize and predict three types of nodes: add-only nodes, which can only gain new edges; constant nodes, whose edges remain unchanged; and del-only nodes, whose edges can only be deleted. In this work, we improve on previous prediction approaches by using word embeddings from NLP to identify the nodes of the large semantic graph and by building a Logistic Regression model. We tested the proposed model on different versions of DBpedia and obtained the following improvements in F1 measure: up to 10% for add-only nodes, close to 15% for constant nodes, and close to 22% for del-only nodes.
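
As an illustration of the three node types described above, the following minimal Python sketch labels nodes by comparing their outgoing edges across two graph versions. It assumes each version is a collection of (subject, predicate, object) triples and that a node's edges are its outgoing triples. The function names, the toy data, and the extra "mixed" label for nodes that both gain and lose edges are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def outgoing_edges(triples):
    """Group (subject, predicate, object) triples by subject node."""
    edges = defaultdict(set)
    for subject, predicate, obj in triples:
        edges[subject].add((predicate, obj))
    return edges


def classify_nodes(old_triples, new_triples):
    """Label every node of the old version by how its edge set changed.

    Assumption: a node whose edge set is identical is "constant", one whose
    old edges are a strict subset of the new ones is "add-only", and one
    whose new edges are a strict subset of the old ones is "del-only".
    Anything else gets the hypothetical label "mixed".
    """
    old_edges = outgoing_edges(old_triples)
    new_edges = outgoing_edges(new_triples)
    labels = {}
    for node, old in old_edges.items():
        new = new_edges.get(node, set())
        if old == new:
            labels[node] = "constant"
        elif old < new:          # strict subset: edges were only added
            labels[node] = "add-only"
        elif new < old:          # strict subset: edges were only deleted
            labels[node] = "del-only"
        else:
            labels[node] = "mixed"
    return labels


# Toy data standing in for two consecutive graph versions (hypothetical).
old_version = [
    ("A", "type", "City"),
    ("B", "type", "Person"),
    ("B", "born", "A"),
    ("C", "type", "Film"),
    ("D", "type", "Band"),
    ("D", "genre", "Rock"),
]
new_version = [
    ("A", "type", "City"),
    ("B", "type", "Person"),
    ("C", "type", "Film"),
    ("C", "year", "1999"),
    ("D", "type", "Band"),
    ("D", "genre", "Pop"),
]

labels = classify_nodes(old_version, new_version)
```

Labels computed this way between two known versions could serve as training targets for a classifier, with the prediction task being to guess a node's label before the next version is available.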

Notes

  1. http://dbpedia.org

  2. https://wiki.dbpedia.org/develop/datasets

  3. https://spark.apache.org/

  4. https://github.com/data61/stellar-random-walk

  5. Note that \(k \le l\) because \(o_k\) could be a literal (a date, for example), in which case it has no outgoing edges.

  6. https://databus.dbpedia.org/dbpedia/mappings

Author information

Corresponding authors

Correspondence to Damián Barsotti or Martín Ariel Domínguez.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Barsotti, D., Domínguez, M.A. (2020). Using Embeddings to Predict Changes in Large Semantic Graphs. In: Lossio-Ventura, J.A., Condori-Fernandez, N., Valverde-Rebaza, J.C. (eds) Information Management and Big Data. SIMBig 2019. Communications in Computer and Information Science, vol 1070. Springer, Cham. https://doi.org/10.1007/978-3-030-46140-9_26

  • DOI: https://doi.org/10.1007/978-3-030-46140-9_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-46139-3

  • Online ISBN: 978-3-030-46140-9

  • eBook Packages: Computer Science (R0)
