Class Annotation Using Linked Open Data

Kellou-Menouer, Kenza; Kedad, Zoubida

doi:10.1007/978-3-319-48472-3_44

Kenza Kellou-Menouer & Zoubida Kedad

Conference paper
First Online: 18 October 2016

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10033)

Abstract

The meaningful usage of RDF datasets requires a description of their content. Part of this description is provided in the dataset itself through class definitions. However, the name of a class does not always accurately reflect its semantics. These semantics can be captured by annotating each class.

In this paper, we present a set of algorithms exploiting the instances of a dataset in order to provide annotations which best capture the semantics of a class. These algorithms rely on an external knowledge source. We introduce three ways of extracting annotations: (i) using the names of instances, (ii) using their property sets and (iii) considering the vocabularies used by the dataset. As an external source, we have used Linked Open Data, which represents an unprecedented amount of knowledge provided on the Web. We also show how annotations can be used to discover a class hierarchy and we present some evaluation results showing the effectiveness of our approach.
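As a rough illustration of the first extraction strategy (using the names of instances), the sketch below ranks candidate annotations by the share of a class's instances that support them. It is not the authors' algorithm: the function `annotate_class`, the `top_k` parameter, and the in-memory dictionary `EXTERNAL_TYPES`, which stands in for a Linked Open Data lookup, are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical stand-in for an external knowledge source; a real system
# would query Linked Open Data rather than this hard-coded dictionary.
EXTERNAL_TYPES = {
    "Paris": ["City", "Place"],
    "Versailles": ["City", "Place"],
    "Lyon": ["City", "Place"],
    "Seine": ["River", "Place"],
}


def annotate_class(instance_names, lookup, top_k=2):
    """Rank candidate annotations by the fraction of instances supporting them."""
    total = len(instance_names)
    if total == 0:
        return []
    counts = Counter()
    for name in instance_names:
        # Count each candidate label at most once per instance.
        counts.update(set(lookup.get(name, [])))
    # Sort by descending support, breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(label, count / total) for label, count in ranked[:top_k]]


annotations = annotate_class(
    ["Paris", "Versailles", "Lyon", "Seine"], EXTERNAL_TYPES
)
print(annotations)
```

Under these assumptions, a label shared by every instance ranks first, and a label supported by only some instances ranks lower.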

Notes

1. Conference: data.semanticweb.org/dumps/conferences/dc-2010-complete.rdf.
2. BNF: datahub.io/fr/dataset/data-bnf-fr.
3. DBpedia: dbpedia.org.

Acknowledgments

This work was partially funded by the French National Research Agency through the CAIR ANR-14-CE23-0006 project.

Author information

Authors and Affiliations

DAVID Laboratory, University of Versailles Saint-Quentin-en-Yvelines, Versailles, France
Kenza Kellou-Menouer & Zoubida Kedad

Corresponding author

Correspondence to Kenza Kellou-Menouer.

Editor information

Editors and Affiliations

ADAPT Centre, Trinity College Dublin, Dublin 2, Ireland
Christophe Debruyne
University of Lorraine, Vandoeuvre-les-Nancy, France
Hervé Panetto
TU Graz, Graz, Austria
Robert Meersman
La Trobe University, Melbourne, Australia
Tharam Dillon
Institute of Computer Languages, TU Wien, Vienna, Austria
eva Kühn
ADAPT Centre, Trinity College Dublin, Dublin 2, Ireland
Declan O'Sullivan
Università degli Studi di Milano Crema, Crema, Italy
Claudio Agostino Ardagna

About this paper

Cite this paper

Kellou-Menouer, K., Kedad, Z. (2016). Class Annotation Using Linked Open Data. In: Debruyne, C., et al. On the Move to Meaningful Internet Systems: OTM 2016 Conferences. OTM 2016. Lecture Notes in Computer Science, vol 10033. Springer, Cham. https://doi.org/10.1007/978-3-319-48472-3_44

DOI: https://doi.org/10.1007/978-3-319-48472-3_44
Published: 18 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48471-6
Online ISBN: 978-3-319-48472-3
eBook Packages: Computer Science, Computer Science (R0)
