Abstract
The growing number of RDF data sources published on the web makes an unprecedented amount of information available. However, querying these sources to extract the information relevant to a specific need, represented by a target schema, is a complex task because the alignment between the target and source schemas may be missing or incomplete. This paper presents an approach that aims to automatically populate the classes of a target schema. Our approach relies on a semi-supervised learning algorithm that iteratively identifies instance patterns in the data source that represent candidate instances for the target schema. We present preliminary experiments showing the effectiveness of our approach.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chevallier, Z., Kedad, Z., Finance, B., Chaillan, F. (2024). Identifying Relevant Data in RDF Sources. In: Araújo, J., de la Vara, J.L., Santos, M.Y., Assar, S. (eds) Research Challenges in Information Science. RCIS 2024. Lecture Notes in Business Information Processing, vol 514. Springer, Cham. https://doi.org/10.1007/978-3-031-59468-7_11
DOI: https://doi.org/10.1007/978-3-031-59468-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-59467-0
Online ISBN: 978-3-031-59468-7
eBook Packages: Computer Science, Computer Science (R0)