Abstract
Nowadays, linked data (LD) are ubiquitous, and the case for mining them for knowledge, e.g. frequent patterns, need not be argued.
A domain ontology (DO) on top of an LD dataset enables the discovery of abstract, a.k.a. generalized, patterns that capture conceptual regularities in the data rather than identical sub-structures. Yet with the resulting ontologically-generalized graph patterns (OGP), a miner faces the combined challenges of graph topology and a label hierarchy, which amplifies well-known difficulties with graphs such as support counting and non-redundant pattern listing. As OGP mining is yet to be addressed in its generality, we propose a formalization and study two workaround methods that avoid tackling it head-on, i.e. that deal with each aspect separately. Both perform pure graph mining with adapted label sets: gSpan-OF merely strips the labels of their hierarchical structure, while Tax-ON first mines frequent graph topologies with only root classes as labels, then successively refines the labels on each topology.
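To make the two-phase idea concrete, the following is a minimal sketch restricted to single-edge patterns. It is not the authors' Tax-ON implementation: the toy taxonomy, the example graphs, and the function names (`ancestors`, `children`, `support`, `mine_edge_patterns`) are all hypothetical, and real graph topologies would require a proper subgraph miner. It only illustrates mining over root classes first and then refining labels one taxonomy level at a time, pruning on support.

```python
# Hypothetical toy taxonomy (child -> parent); root classes have parent None.
PARENT = {
    "Animal": None, "Cow": "Animal", "Goat": "Animal",
    "Feed": None, "Hay": "Feed", "Silage": "Feed",
}

def ancestors(label):
    """Return the label followed by all of its superclasses, up to the root."""
    chain = []
    while label is not None:
        chain.append(label)
        label = PARENT[label]
    return chain

def children(label):
    """Direct subclasses of a label in the toy taxonomy."""
    return [c for c, p in PARENT.items() if p == label]

def support(pattern, graphs):
    """Number of graphs with at least one edge matched by the pattern.

    A pattern (src, prop, dst) matches an edge when the property is the same
    and each pattern label is the edge label itself or one of its superclasses.
    """
    src, prop, dst = pattern
    return sum(
        any(p == prop and src in ancestors(s) and dst in ancestors(d)
            for s, p, d in g)
        for g in graphs
    )

def mine_edge_patterns(graphs, min_support):
    """Phase 1: frequent edges over root classes; phase 2: refine labels."""
    roots = {(ancestors(s)[-1], p, ancestors(d)[-1])
             for g in graphs for s, p, d in g}
    frontier = [pat for pat in roots if support(pat, graphs) >= min_support]
    frequent = set(frontier)
    while frontier:
        src, prop, dst = frontier.pop()
        # Specialise one end at a time by one taxonomy level.
        refinements = [(c, prop, dst) for c in children(src)]
        refinements += [(src, prop, c) for c in children(dst)]
        for pat in refinements:
            # Refining a label can only lower support, so infrequent
            # refinements are never expanded further.
            if pat not in frequent and support(pat, graphs) >= min_support:
                frequent.add(pat)
                frontier.append(pat)
    return frequent

# Each graph is a set of (subject class, property, object class) edges.
GRAPHS = [
    {("Cow", "eats", "Hay")},
    {("Cow", "eats", "Silage")},
    {("Goat", "eats", "Hay")},
]

result = mine_edge_patterns(GRAPHS, min_support=2)
```

With these three toy graphs and a minimum support of 2, no fully specific edge is frequent, yet the generalized patterns `("Cow", "eats", "Feed")` and `("Animal", "eats", "Hay")` are. This is the kind of conceptual regularity that an identical-sub-structure miner would miss.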
Notes
1. Pattern labels in \({\mathcal L}_p\) will be prefixed by & to differentiate them from ontology entities.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Martin, T. et al. (2021). Towards Mining Generalized Patterns from RDF Data and a Domain Ontology. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1524. Springer, Cham. https://doi.org/10.1007/978-3-030-93736-2_21
DOI: https://doi.org/10.1007/978-3-030-93736-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93735-5
Online ISBN: 978-3-030-93736-2
eBook Packages: Computer Science, Computer Science (R0)