
Augmenting Linked Data Ontologies with New Object Properties

New Generation Computing

Abstract

Although several RDF knowledge bases are available through the Linked Open Data (LOD) initiative, the ontology schemas of these linked datasets are not very rich. In particular, they lack object properties. The problem of finding new object properties between two given classes has not been investigated in detail in the context of Linked Data. In the first part of this paper, we present DARO (Detecting Arbitrary Relations for enriching Ontology of Linked Data), an unsupervised solution for enriching the LOD cloud with new object properties (and their instances) between two given classes. DARO first identifies text patterns from the web corpus that can potentially represent relations between individuals. These text patterns are then clustered by semantic similarity to capture the object properties between the two given classes. We have empirically evaluated our approach on several pairs of classes and found that the system can indeed be used to enrich linked datasets with new object properties and their instances. We have compared DARO with newOntExt, an offshoot of the NELL (Never-Ending Language Learning) effort. Our experiments reveal that DARO gives better results than newOntExt as a recall-oriented system. In the second part of the paper, we propose a methodology to predict pairs of classes that could be connected by object properties but are not yet connected. We claim that evidence obtained from external textual resources and their Word2Vec representations can be used for this purpose. Our approach gives results that are complementary to those of the traditional techniques found in the literature. Hence, our method can be used in combination with the traditional techniques for maximum benefit.
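The pattern-clustering step described in the abstract can be illustrated with a small sketch. This is not DARO's implementation: the abstract does not specify the similarity measure or the clustering algorithm, so the example assumes a plain token-overlap score (the Sørensen-Dice coefficient) as a stand-in for semantic similarity and a greedy single-pass clustering. The function names, the threshold, and the sample patterns are all hypothetical.

```python
from typing import List, Set


def tokens(pattern: str) -> Set[str]:
    """Lowercase a text pattern and split it into a set of word tokens."""
    return set(pattern.lower().split())


def dice(a: Set[str], b: Set[str]) -> float:
    """Sørensen-Dice coefficient between two token sets (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def cluster_patterns(patterns: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Greedy single-pass clustering (an assumed stand-in, not DARO's method).

    Each pattern joins the first cluster whose seed (first member) is at
    least `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for p in patterns:
        for c in clusters:
            if dice(tokens(p), tokens(c[0])) >= threshold:
                c.append(p)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([p])
    return clusters


# Hypothetical patterns that might link a writer to a novel or a place.
patterns = [
    "is the author of",
    "is author of",
    "was born in",
    "born in",
    "wrote",
]
clusters = cluster_patterns(patterns)
for c in clusters:
    print(c)
```

In this toy run, "wrote" ends up in its own cluster even though it expresses the same relation as "is the author of", because the two share no tokens. That gap is one reason a semantic measure, such as the WordNet-based or embedding-based similarity the abstract alludes to, would be preferred over surface overlap.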



Notes

  1. http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData.

  2. https://www.w3.org/2001/sw/sweo/public/UseCases/BBC/.

  3. http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/statistics/. In total there are 60 object properties, but 28 of them connect the domain class to the class http://yago-knowledge.org/resource/yagoLiteral.

  4. http://yago-knowledge.org/resource/wordnet_religion_105946687.

  5. http://yago-knowledge.org/resource/wordnet_country_108544813.

  6. http://yago-knowledge.org/resource/Christianity.

  7. http://yago-knowledge.org/resource/Australia.

  8. http://dbpedia.org/ontology/SportFacility.

  9. http://dbpedia.org/ontology/SportsManager.

  10. http://dbpedia.org/ontology/SportsSeason.

  11. http://dbpedia.org/ontology/SportsTeam.

  12. ReVerb ClueWeb Extractions 1.1: dataset consisting of 15 million triples produced by running ReVerb on the English portion of ClueWeb09 corpus.

  13. http://yago-knowledge.org/resource/wordnet_writer_110794014.

  14. http://yago-knowledge.org/resource/wordnet_novel_106367879.

  15. https://sites.google.com/site/ontoworks/projects.

  16. NELL.08m.1050.esv.csv downloaded from http://rtw.ml.cmu.edu/rtw/resources on 26th April 2017.

  17. https://github.com/MaLL-UFSCar/ontext.

  18. Downloaded from http://rtw.ml.cmu.edu/resources/nps/NELL.ClueWeb09.v1.nps.csv.gz.

  19. http://wordnetweb.princeton.edu/perl/webwn?s=flow.

  20. http://wordnetweb.princeton.edu/perl/webwn?s=traverse.

  21. https://code.google.com/archive/p/word2vec/.

  22. https://sites.google.com/site/ontoworks/projects.


Author information

Corresponding author

Correspondence to S. Subhashree.


Appendix

See Table 9.

Table 9 Sample of the synsets generated by PATTY for the classes (Athlete, Athlete)


Cite this article

Subhashree, S., Kumar, P.S. Augmenting Linked Data Ontologies with New Object Properties. New Gener. Comput. 38, 125–152 (2020). https://doi.org/10.1007/s00354-020-00085-0


  • DOI: https://doi.org/10.1007/s00354-020-00085-0
