Abstract
In this work, we propose a method for computing generalized frequent subgraph patterns that is based on the graph edit distance. Graph data is often equipped with semantic information in the form of an ontology, for example when dealing with linked data or knowledge graphs. Previous work suggests exploiting this semantic information to compute frequent generalized patterns, i.e., patterns for which the total frequency of all more specific patterns exceeds the frequency threshold. However, the problem of computing the frequency of a generalized pattern has not yet been fully addressed.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Palme, R., Welke, P. (2023). Frequent Generalized Subgraph Mining via Graph Edit Distances. In: Koprinska, I., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2022. Communications in Computer and Information Science, vol 1753. Springer, Cham. https://doi.org/10.1007/978-3-031-23633-4_32
DOI: https://doi.org/10.1007/978-3-031-23633-4_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23632-7
Online ISBN: 978-3-031-23633-4
eBook Packages: Computer Science (R0)