Abstract
We propose a new probabilistic graph kernel. It is defined by the set of frequent subtrees generated from a small random sample of spanning trees of the transaction graphs. In contrast to the ordinary frequent subgraph kernel it can be computed efficiently for any arbitrary graphs. Due to its probabilistic nature, the embedding function corresponding to our graph kernel is not always correct. Our empirical results on artificial and real-world chemical datasets, however, demonstrate that the graph kernel we propose is much faster than other frequent pattern based graph kernels, with only marginal loss in predictive accuracy.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsNotes
- 1.
We assume that the query graph has been selected from the same (unknown) probability distribution as the graphs in the input database.
- 2.
- 3.
MUTAG, NCI1, NCI109, and PTC were obtained from http://www.di.ens.fr/~shervashidze/code.html.
- 4.
- 5.
References
Chang, C.-C., Lin, C.-J.: Libsvm: a library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2(3), 27 (2011)
Chi, Y., Muntz, R.R., Nijssen, S., Kok, J.N.: Frequent subtree mining - an overview. Fundam. Inform. 66(1–2), 161–198 (2001)
Cook, D.J., Holder, L.B.: Substructure discovery using minimum description length and background knowledge. J. Artif. Intell. Res. (JAIR) 1, 231–255 (1994)
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
Deshpande, M., Kuramochi, M., Wale, N., Karypis, G.: Frequent substructure-based approaches for classifying chemical compounds. IEEE Trans. Knowl. Data Eng. 17(8), 1036–1050 (2005)
Erdős, P., Rényi, A.: On random graphs. Publicationes Mathematicae 6(5), 290–297 (1959)
Gärtner, T.: Kernels for structured data. Ph.D. thesis, Universität Bonn (2005)
Haussler, D.: Convolution kernels on discrete structures. Technical report UCSC-CRL-99-10, Univerisity of California - Santa Cruz, July 1999
Horváth, T., Bringmann, B., De Raedt, L.: Frequent hypergraph mining. In: Muggleton, S.H., Otero, R., Tamaddoni-Nezhad, A. (eds.) ILP 2006. LNCS (LNAI), vol. 4455, pp. 244–259. Springer, Heidelberg (2007)
Horváth, T., Ramon, J.: Efficient frequent connected subgraph mining in graphs of bounded tree-width. Theor. Comput. Sci. 411(31–33), 2784–2797 (2010)
Mahé, P., Vert, J.-P.: Graph kernels based on tree patterns for molecules. Mach. Learn. 75(1), 3–35 (2009)
Shervashidze, N., Schweitzer, P., Van Leeuwen, E.J., Mehlhorn, K., Borgwardt, K.M.: Weisfeiler-lehman graph kernels. J. Mach. Learn. Res. 12, 2539–2561 (2011)
Wilson, D.B.: Generating random spanning trees more quickly than the cover time. In: Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, pp. 296–303. ACM (1996)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Welke, P., Horváth, T., Wrobel, S. (2016). Probabilistic Frequent Subtree Kernels. In: Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2015. Lecture Notes in Computer Science(), vol 9607. Springer, Cham. https://doi.org/10.1007/978-3-319-39315-5_12
Download citation
DOI: https://doi.org/10.1007/978-3-319-39315-5_12
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39314-8
Online ISBN: 978-3-319-39315-5
eBook Packages: Computer ScienceComputer Science (R0)