Skip to main content

COMET: A Contextualized Molecule-Based Matching Technique

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11706))

Abstract

Context-specific description of entities –expressed in RDF– poses challenges during data-driven tasks, e.g., data integration, and context-aware entity matching represents a building-block for these tasks. However, existing approaches only consider inter-schema mapping of data sources, and are not able to manage several contexts during entity matching. We devise COMET, an entity matching technique that relies on both the knowledge stated in RDF vocabularies and context-based similarity metrics to match contextually equivalent entities. COMET executes a novel 1-1 perfect matching algorithm for matching contextually equivalent entities based on the combined scores of semantic similarity and context similarity. COMET employs the Formal Concept Analysis algorithm in order to compute the context similarity of RDF entities. We empirically evaluate the performance of COMET on a testbed from DBpedia. The experimental results suggest that COMET is able to accurately match equivalent RDF graphs in a context-dependent manner.

This research was supported by the European project QualiChain (number 822404).

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Notes

  1. 1.

    https://en.wikipedia.org/wiki/Formal_concept_analysis.

  2. 2.

    https://github.com/RDF-Molecules/COMET.

References

  1. Acosta, M., Vidal, M.-E., Lampo, T., Castillo, J., Ruckhaus, E.: ANAPSID: an adaptive query processing engine for SPARQL endpoints. In: Aroyo, L., et al. (eds.) ISWC 2011. LNCS, vol. 7031, pp. 18–34. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25073-6_2

    Chapter  Google Scholar 

  2. Beek, W., Schlobach, S., van Harmelen, F.: A contextualised semantics for owl:sameAs. In: Sack, H., Blomqvist, E., d’Aquin, M., Ghidini, C., Ponzetto, S.P., Lange, C. (eds.) ESWC 2016. LNCS, vol. 9678, pp. 405–419. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-34129-3_25

    Chapter  Google Scholar 

  3. Collarana, D., Galkin, M., Ribón, I.T., Vidal, M., Lange, C., Auer, S.: MINTE: semantically integrating RDF graphs. In: Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics, WIMS, pp. 22:1–22:11 (2017)

    Google Scholar 

  4. Endris, K.M., Galkin, M., Lytra, I., Mami, M.N., Vidal, M.-E., Auer, S.: MULDER: querying the linked data web by bridging RDF molecule templates. In: Benslimane, D., Damiani, E., Grosky, W.I., Hameurlain, A., Sheth, A., Wagner, R.R. (eds.) DEXA 2017. LNCS, vol. 10438, pp. 3–18. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-64468-4_1

    Chapter  Google Scholar 

  5. Isele, R., Bizer, C.: Active learning of expressive linkage rules using genetic programming. J. Web Semant. 23, 2–15 (2013)

    Article  Google Scholar 

  6. Jagadish, H.V., et al.: Big data and its technical challenges. Commun. ACM 57(7), 86–94 (2014)

    Article  Google Scholar 

  7. Knoblock, C.A., et al.: Semi-automatically mapping structured sources into the Semantic Web. In: Simperl, E., Cimiano, P., Polleres, A., Corcho, O., Presutti, V. (eds.) ESWC 2012. LNCS, vol. 7295, pp. 375–390. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-30284-8_32

    Chapter  Google Scholar 

  8. Michelfeit, J., Knap, T.: Linked data fusion in ODCleanStore. In: ISWC Posters and Demonstrations Track (2012)

    Google Scholar 

  9. Ribón, I.T., Vidal, M., Kämpgen, B., Sure-Vetter, Y.: GADES: a graph-based semantic similarity measure. In: SEMANTICS - 12th International Conference on Semantic Systems, Leipzig, Germany, pp. 101–104 (2016)

    Google Scholar 

  10. Schultz, A., Matteini, A., Isele, R., Mendes, P.N., Bizer, C., Becker, C.: LDIF - a framework for large-scale linked data integration. In: International World Wide Web Conference (2012)

    Google Scholar 

  11. Verroios, V., Garcia-Molina, H., Papakonstantinou, Y.: Waldo: an adaptive human interface for crowd entity resolution. In: International Conference on Management of Data, pp. 1133–1148 (2017)

    Google Scholar 

  12. Vychodil, V.: A new algorithm for computing formal concepts. na (2008)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Diego Collarana .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Tasnim, M., Collarana, D., Graux, D., Galkin, M., Vidal, ME. (2019). COMET: A Contextualized Molecule-Based Matching Technique. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science(), vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_13

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-27615-7_13

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27614-0

  • Online ISBN: 978-3-030-27615-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics