ABSTRACT
In an RDF dataset, the truth value of a triple is known only when the triple is explicitly stated or can be inferred by logical entailment. Because RDF follows open world semantics, nothing can be said about the truth values of triples that are neither in the dataset nor logically inferable. Estimating the truth values of such triples makes it possible to discover new information in the knowledge base. This, for instance, broadens the range of queries that can be answered against an RDF base, supports knowledge engineers in maintaining such knowledge bases, and allows resources worth looking into to be recommended to users. In this paper, we present a new approach for predicting the truth value of any RDF triple. Our approach uses a 3-dimensional tensor representation of the RDF knowledge base and applies tensor factorization techniques that take open world semantics into account to predict new true triples from already observed ones. We report the results of experiments on real-world datasets that compare different tensor factorization models. Our empirical results indicate that the approach is highly successful in estimating triple truth values on incomplete RDF datasets.
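As a rough illustration of the 3-dimensional tensor representation mentioned above, the following minimal sketch encodes a handful of triples as a subject × predicate × object array. The function name `triples_to_tensor`, the `ex:` resources, and the use of NumPy are assumptions made for this example only; the paper's own encoding and factorization models are not reproduced here.

```python
import numpy as np


def triples_to_tensor(triples):
    """Encode (subject, predicate, object) triples as a binary 3-D array.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Assign a stable integer index to every distinct subject, predicate, object.
    s_idx = {s: i for i, s in enumerate(sorted({s for s, _, _ in triples}))}
    p_idx = {p: i for i, p in enumerate(sorted({p for _, p, _ in triples}))}
    o_idx = {o: i for i, o in enumerate(sorted({o for _, _, o in triples}))}

    tensor = np.zeros((len(s_idx), len(p_idx), len(o_idx)), dtype=np.int8)
    for s, p, o in triples:
        # 1 marks an observed (true) triple. Under open world semantics a 0
        # means "unknown", not "false".
        tensor[s_idx[s], p_idx[p], o_idx[o]] = 1
    return tensor, s_idx, p_idx, o_idx


# Made-up example triples.
triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:worksAt", "ex:acme"),
]
tensor, s_idx, p_idx, o_idx = triples_to_tensor(triples)
```

A factorization model would then be trained to score the zero (unknown) cells, so that cells likely to correspond to true triples rank above the others.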