Identifying Relevant Data in RDF Sources

Chevallier, Zoé; Kedad, Zoubida; Finance, Béatrice; Chaillan, Frédéric

doi:10.1007/978-3-031-59468-7_11


Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 514)

Included in the following conference series:

International Conference on Research Challenges in Information Science

Abstract

The growing number of RDF data sources published on the web makes an unprecedented amount of information available. However, querying these sources to extract the information relevant to a specific need, represented by a target schema, is a complex task because the alignment between the target and source schemas may be missing or incomplete. This paper presents an approach that aims to populate the classes of a target schema automatically. Our approach relies on a semi-supervised learning algorithm that iteratively identifies instance patterns in the data source that represent candidate instances for the target schema. We present preliminary experiments showing the effectiveness of our approach.
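To make the idea of iteratively identifying instance patterns more concrete, the following is a minimal sketch of a generic self-training loop, not the authors' algorithm. It assumes that an instance pattern is simply the set of properties describing an entity, that candidates are compared to accepted patterns with Jaccard similarity, and that a few seed instances of the target class are known. The function names, the `threshold` parameter, and the `ex:` sample data are all hypothetical.

```python
from typing import Dict, FrozenSet, Set


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two property sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def populate_class(
    entities: Dict[str, FrozenSet[str]],
    seeds: Set[str],
    threshold: float = 0.5,
    max_rounds: int = 10,
) -> Set[str]:
    """Grow a set of candidate instances for one target class.

    `entities` maps each subject IRI to its property set (the assumed
    "instance pattern"). Every seed must be a key of `entities`.
    """
    accepted = set(seeds)
    patterns = {entities[s] for s in seeds}
    for _ in range(max_rounds):
        # Accept every unlabeled entity close enough to a known pattern.
        new = {
            iri
            for iri, props in entities.items()
            if iri not in accepted
            and any(jaccard(props, p) >= threshold for p in patterns)
        }
        if not new:  # fixed point reached: nothing more to add
            break
        accepted |= new
        # Newly accepted entities contribute their patterns to the next round.
        patterns |= {entities[iri] for iri in new}
    return accepted


# Hypothetical source data: subject IRI -> properties used to describe it.
SOURCE = {
    "ex:alice": frozenset({"name", "birthDate", "worksFor"}),
    "ex:bob": frozenset({"name", "birthDate"}),
    "ex:carol": frozenset({"name", "birthDate", "email"}),
    "ex:dave": frozenset({"name", "email"}),
    "ex:acme": frozenset({"legalName", "address"}),
}

print(sorted(populate_class(SOURCE, seeds={"ex:alice"})))
# → ['ex:alice', 'ex:bob', 'ex:carol', 'ex:dave']
```

In this toy run, `ex:dave` is accepted only in the second round, through the pattern contributed by `ex:carol`. This shows both the benefit of iterating and the risk of drifting away from the seeds, which is why the threshold matters in any scheme of this kind.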

Author information

Authors and Affiliations

DAVID Lab, University of Versailles Paris-Saclay, Versailles, France
Zoé Chevallier, Zoubida Kedad & Béatrice Finance
Grand Paris Sud, Evry-Courcouronnes, France
Zoé Chevallier & Frédéric Chaillan

Corresponding author

Correspondence to Zoé Chevallier.

Editor information

Editors and Affiliations

NOVA University Lisbon, Caparica, Portugal
João Araújo
University of Castilla La Mancha, Albacete, Spain
Jose Luis de la Vara
University of Minho, Guimarães, Portugal
Maribel Yasmina Santos
Institut Mines-Télécom Business School, Evry, France
Saïd Assar

Cite this paper

Chevallier, Z., Kedad, Z., Finance, B., Chaillan, F. (2024). Identifying Relevant Data in RDF Sources. In: Araújo, J., de la Vara, J.L., Santos, M.Y., Assar, S. (eds) Research Challenges in Information Science. RCIS 2024. Lecture Notes in Business Information Processing, vol 514. Springer, Cham. https://doi.org/10.1007/978-3-031-59468-7_11

DOI: https://doi.org/10.1007/978-3-031-59468-7_11
Published: 04 May 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-59467-0
Online ISBN: 978-3-031-59468-7
eBook Packages: Computer Science, Computer Science (R0)
