Graph-Based Induction for General Graph Structured Data and Its Application to Chemical Compound Data

Matsuda, Takashi; Horiuchi, Tadashi; Motoda, Hiroshi; Washio, Takashi

doi:10.1007/3-540-44418-1_9


Conference paper
First Online: 19 October 2001


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1967)

Abstract

Most relations can be represented by a graph structure: chemical bonding, Web browsing records, DNA sequences, and inference patterns (program traces), to name a few. Efficiently finding characteristic substructures in a graph is therefore a useful technique in many important KDD/ML applications. However, graph pattern matching is a hard problem. We propose a machine learning technique called Graph-Based Induction (GBI) that efficiently extracts typical patterns from graph data in an approximate manner by stepwise pair expansion (pairwise chunking). It can handle general graph structured data, i.e., directed or undirected graphs, with colored or uncolored nodes, with or without (self) loops, and with colored or uncolored links. We show that its time complexity is almost linear in the size of the graph. We further show that GBI can be applied effectively to extracting typical patterns from chemical compound data, from which classification rules are then generated, and that GBI also works as a feature construction component for other machine learning tools.
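To make the idea of stepwise pair expansion concrete, the following is a minimal sketch and not the authors' implementation. It assumes an undirected graph with colored (labeled) nodes, uncolored links, and no self loops, and it assumes that the pair to chunk is simply the most frequent pair of adjacent labels. The abstract does not state the selection criterion, and the function names (`pair_key`, `chunk_once`, `gbi_sketch`) and the `min_count` threshold are hypothetical. Each step rewrites non-overlapping occurrences of the chosen pair into a single new node, so a chunk can later take part in a larger chunk.

```python
from collections import Counter


def pair_key(a, b):
    """Order-independent key for the labels at both ends of an undirected edge."""
    return tuple(sorted((a, b), key=repr))


def chunk_once(labels, edges, min_count=2):
    """Run one pairwise-chunking step (simplified, frequency-based).

    labels: dict mapping integer node id -> label (nested tuples after chunking)
    edges:  set of frozenset({u, v}) undirected edges, no self loops
    Returns (new_labels, new_edges, chosen_pair), or None if no pair of
    adjacent labels occurs at least min_count times.
    """
    counts = Counter(pair_key(labels[u], labels[v]) for u, v in map(tuple, edges))
    if not counts:
        return None
    # Most frequent pair; ties are broken deterministically by repr().
    pair, freq = max(counts.items(), key=lambda kv: (kv[1], repr(kv[0])))
    if freq < min_count:
        return None

    new_labels = dict(labels)
    merged = {}  # old node id -> id of the chunk node that replaced it
    next_id = max(labels) + 1
    # Greedily pick non-overlapping edges whose end labels match the pair.
    for u, v in sorted(sorted(e) for e in edges):
        if u in merged or v in merged:
            continue
        if pair_key(labels[u], labels[v]) != pair:
            continue
        merged[u] = merged[v] = next_id
        new_labels[next_id] = pair  # the chunk becomes one node
        del new_labels[u], new_labels[v]
        next_id += 1

    # Redirect remaining edges to chunk nodes; drop edges inside a chunk.
    new_edges = set()
    for u, v in map(tuple, edges):
        a, b = merged.get(u, u), merged.get(v, v)
        if a != b:
            new_edges.add(frozenset((a, b)))
    return new_labels, new_edges, pair


def gbi_sketch(labels, edges, min_count=2):
    """Repeat chunking until no pair is frequent enough; return chosen pairs."""
    patterns = []
    while True:
        step = chunk_once(labels, edges, min_count)
        if step is None:
            return patterns
        labels, edges, pair = step
        patterns.append(pair)


# Toy input: two C-O-H chains (node ids 0-2 and 3-5).
toy_labels = {0: "C", 1: "O", 2: "H", 3: "C", 4: "O", 5: "H"}
toy_edges = {frozenset(e) for e in [(0, 1), (1, 2), (3, 4), (4, 5)]}
patterns = gbi_sketch(toy_labels, toy_edges)
print(patterns)
```

On this toy input the sketch first chunks an adjacent O and H pair and then chunks C with that O-H chunk, so the nested tuple in the second pattern describes the whole C-O-H chain. Because only adjacent label pairs are counted and occurrences are chosen greedily, the result is approximate, which matches the trade-off the abstract describes: no full subgraph matching is attempted.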



Author information

Authors and Affiliations

I.S.I.R., Osaka University, 8-1 Mihogaoka, 567-0047, Ibaraki, Osaka, JAPAN
Takashi Matsuda, Tadashi Horiuchi, Hiroshi Motoda & Takashi Washio


Editor information

Editors and Affiliations

Faculty of Information Science and Electrical Engineering, Department of Informatics, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, 812-8581, Fukuoka, Japan
Setsuo Arikawa
Faculty of Science Department of Information Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan
Shinichi Morishita


Cite this paper

Matsuda, T., Horiuchi, T., Motoda, H., Washio, T. (2000). Graph-Based Induction for General Graph Structured Data and Its Application to Chemical Compound Data. In: Arikawa, S., Morishita, S. (eds) Discovery Science. DS 2000. Lecture Notes in Computer Science, vol 1967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44418-1_9


DOI: https://doi.org/10.1007/3-540-44418-1_9
Published: 19 October 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41352-3
Online ISBN: 978-3-540-44418-3
eBook Packages: Springer Book Archive
