Bootstrapping Object Coreferencing on the Semantic Web

Journal of Computer Science and Technology

Abstract

An object on the Semantic Web is likely to be denoted by several URIs coined by different parties. Object coreferencing is the process of identifying “equivalent” URIs of objects, a step toward a better Data Web. In this paper, we propose a bootstrapping approach for object coreferencing on the Semantic Web. For an object URI, we first establish a kernel that consists of semantically equivalent URIs derived from same-as assertions, (inverse) functional properties, and (max-)cardinalities, and then extend the kernel based on the textual descriptions (e.g., labels and local names) of URIs. We also propose a trustworthiness-based method to rank the coreferent URIs in the kernel, as well as a similarity-based method to rank the URIs in the extension of the kernel. We implement the proposed approach, called ObjectCoref, on a large-scale dataset that contains 76 million URIs collected by the Falcons search engine up to 2008. An evaluation of precision, relative recall, and response time demonstrates the feasibility of our approach. Additionally, we apply the proposed approach to investigate how widespread the URI alias phenomenon is on the current Semantic Web.
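
The two stages described above (kernel, then extension) can be illustrated with a minimal sketch. It is not ObjectCoref's implementation: it assumes triples are plain (subject, predicate, object) string tuples, handles only same-as and inverse functional properties (functional properties, max-cardinalities, and the trustworthiness-based ranking are omitted), makes a single pass rather than iterating to a fixpoint, and substitutes Python's difflib string similarity for the similarity measure used in the paper. The names build_kernels, extend_kernel, and threshold are hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher

SAME_AS = "owl:sameAs"


def build_kernels(triples, inverse_functional=frozenset()):
    """Group URIs into kernels of equivalent URIs using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # (property, value) -> subjects that share this value
    subjects_by_value = defaultdict(list)
    for s, p, o in triples:
        find(s)  # register every subject, even if it stays a singleton
        if p == SAME_AS:
            union(s, o)
        elif p in inverse_functional:
            subjects_by_value[(p, o)].append(s)

    # Subjects sharing a value of an inverse functional property corefer.
    for group in subjects_by_value.values():
        for other in group[1:]:
            union(group[0], other)

    kernels = defaultdict(set)
    for uri in parent:
        kernels[find(uri)].add(uri)
    return list(kernels.values())


def extend_kernel(kernel, labels, threshold=0.8):
    """Rank URIs outside the kernel by label similarity (assumed measure)."""
    kernel_labels = [labels[u].lower() for u in kernel if u in labels]
    scored = []
    for uri, label in labels.items():
        if uri in kernel:
            continue
        best = max(
            (SequenceMatcher(None, label.lower(), k).ratio() for k in kernel_labels),
            default=0.0,
        )
        if best >= threshold:
            scored.append((best, uri))
    # Most similar first; ties broken by URI for a stable order.
    return [uri for _, uri in sorted(scored, key=lambda t: (-t[0], t[1]))]
```

Keeping the kernel (logical equivalence) separate from the extension (textual similarity) mirrors the bootstrapping idea: the kernel supplies labels that are then used to find further candidate URIs.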

Author information

Corresponding author

Correspondence to Wei Hu.

Additional information

This work is supported in part by the National Natural Science Foundation of China under Grant Nos. 61003018 and 60973024, in part by the National Research Foundation for the Doctoral Program of Higher Education of China under Grant No. 20100091120041, and also in part by the IBM CRL UR Joint Project.

About this article

Cite this article

Hu, W., Qu, YZ. & Sun, XZ. Bootstrapping Object Coreferencing on the Semantic Web. J. Comput. Sci. Technol. 26, 663–675 (2011). https://doi.org/10.1007/s11390-011-1166-z

  • DOI: https://doi.org/10.1007/s11390-011-1166-z
