Using Embeddings to Predict Changes in Large Semantic Graphs

Barsotti, Damián; Domínguez, Martín Ariel

doi:10.1007/978-3-030-46140-9_26

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1070)

Included in the following conference series:

Annual International Symposium on Information Management and Big Data

Abstract

Understanding and predicting how large knowledge graphs change over time is as difficult as it is useful. An important subtask of this artificial intelligence challenge is to characterize and predict three types of nodes: add-only nodes, which can only gain new edges; constant nodes, whose edges remain unchanged; and del-only nodes, whose edges can only be deleted. In this work, we improve on previous prediction approaches by using word embeddings from NLP to identify the nodes of the large semantic graph and build a Logistic Regression model. We tested the proposed model on different versions of DBpedia and obtained the following improvements in F1 measure: up to 10% for add-only nodes, close to 15% for constant nodes, and close to 22% for del-only nodes.
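
To make the three node types concrete, the following is a minimal sketch of how a node could be labeled by comparing its outgoing edges in two graph versions. It is not the authors' implementation: the names `label_node` and `label_graph`, the representation of a graph as a dictionary from subject to a set of (predicate, object) pairs, the choice to treat the three classes as disjoint, and the extra "mixed" label for nodes that both gain and lose edges are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# Assumed representation: an edge is a (predicate, object) pair leaving a
# subject node, and a graph version maps each subject to its edge set.
Edge = Tuple[str, str]
Graph = Dict[str, Set[Edge]]


def label_node(old_edges: Set[Edge], new_edges: Set[Edge]) -> str:
    """Label one node from its outgoing edges in two graph versions."""
    added = new_edges - old_edges
    removed = old_edges - new_edges
    if not added and not removed:
        return "constant"   # edges remain unchanged
    if not removed:
        return "add-only"   # edges were only added
    if not added:
        return "del-only"   # edges were only deleted
    return "mixed"          # hypothetical label: both added and deleted


def label_graph(old: Graph, new: Graph) -> Dict[str, str]:
    """Label every node of the old version; a missing node has no edges."""
    return {
        node: label_node(edges, new.get(node, set()))
        for node, edges in old.items()
    }


# Toy data, invented for this sketch only.
old_version: Graph = {
    "A": {("p", "x")},
    "B": {("p", "x")},
    "C": {("p", "x"), ("q", "y")},
    "D": {("p", "x")},
}
new_version: Graph = {
    "A": {("p", "x"), ("q", "y")},
    "B": {("p", "x")},
    "C": {("p", "x")},
    "D": {("q", "y")},
}

labels = label_graph(old_version, new_version)
print(labels)
```

Labels computed this way from two consecutive versions could serve as training targets for a classifier. How the paper derives its labels and features is not described in the abstract, so that step is left out here.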

Notes

1. http://dbpedia.org.
2. https://wiki.dbpedia.org/develop/datasets.
3. https://spark.apache.org/.
4. https://github.com/data61/stellar-random-walk.
5. Note that \(k\le l\) because \(o_k\) could be a literal (a date, for example), in which case it has no outgoing edges.
6. https://databus.dbpedia.org/dbpedia/mappings.

Author information

Authors and Affiliations

Group of Analysis and Processing of Large Social and Semantic Networks, FaMAF, Universidad Nacional de Córdoba, Córdoba, Argentina
Damián Barsotti & Martín Ariel Domínguez

Corresponding authors

Correspondence to Damián Barsotti or Martín Ariel Domínguez.

Editor information

Editors and Affiliations

Stanford University, Stanford, CA, USA
Juan Antonio Lossio-Ventura
University of A Coruña, A Coruña, Spain
Nelly Condori-Fernandez
Visibilia, São Paulo, Brazil
Jorge Carlos Valverde-Rebaza

About this paper

Cite this paper

Barsotti, D., Domínguez, M.A. (2020). Using Embeddings to Predict Changes in Large Semantic Graphs. In: Lossio-Ventura, J.A., Condori-Fernandez, N., Valverde-Rebaza, J.C. (eds) Information Management and Big Data. SIMBig 2019. Communications in Computer and Information Science, vol 1070. Springer, Cham. https://doi.org/10.1007/978-3-030-46140-9_26

DOI: https://doi.org/10.1007/978-3-030-46140-9_26
Published: 23 April 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46139-3
Online ISBN: 978-3-030-46140-9
eBook Packages: Computer Science, Computer Science (R0)
