Graph-Based Induction for General Graph Structured Data and Its Application to Chemical Compound Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1967)

Abstract

Many kinds of relations are naturally represented by a graph structure, e.g., chemical bonding, Web browsing records, DNA sequences, and inference patterns (program traces), to name a few. Efficiently finding characteristic substructures in a graph is therefore a useful technique in many important KDD/ML applications. However, graph pattern matching is a hard problem. We propose a machine learning technique called Graph-Based Induction (GBI) that efficiently extracts typical patterns from graph data in an approximate manner by stepwise pair expansion (pairwise chunking). It can handle general graph-structured data, i.e., directed or undirected, colored or uncolored graphs, with or without (self) loops and with colored or uncolored links. We show that its time complexity is almost linear in the size of the graph. We further show that GBI can be applied effectively to extract typical patterns from chemical compound data, from which classification rules can be generated, and that GBI also works as a feature construction component for other machine learning tools.
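The abstract names stepwise pair expansion (pairwise chunking) without giving details, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes an undirected graph with string node labels, picks the adjacent label pair that occurs on the most edges, merges non-overlapping occurrences into a single chunk node, and repeats. The function names `chunk_once` and `extract_patterns`, the `min_count` threshold, the raw-frequency selection criterion, and the alphabetical tie-breaking are all assumptions made for illustration.

```python
from collections import Counter


def chunk_once(labels, edges, min_count=2):
    """One illustrative pair-expansion step (hypothetical helper).

    labels: dict mapping integer node id -> string label.
    edges:  set of (u, v) tuples, treated as undirected.
    Returns (new_labels, new_edges, chunk_label), or None when no
    adjacent label pair occurs on at least min_count edges.
    """
    def pair_of(u, v):
        return tuple(sorted((labels[u], labels[v])))

    # Count label pairs over edges (self-loops are ignored in this sketch).
    counts = Counter(pair_of(u, v) for u, v in edges if u != v)
    if not counts:
        return None
    # Most frequent pair; ties broken alphabetically (an assumption).
    best, freq = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    if freq < min_count:
        return None

    chunk_label = f"({best[0]}-{best[1]})"
    merged = {}                 # old node id -> id of the chunk node
    new_labels = dict(labels)
    next_id = max(labels) + 1
    for u, v in sorted(edges):
        is_match = u != v and pair_of(u, v) == best
        if is_match and u not in merged and v not in merged:
            merged[u] = merged[v] = next_id
            new_labels[next_id] = chunk_label
            del new_labels[u], new_labels[v]
            next_id += 1

    # Rewire remaining edges; edges absorbed inside a chunk are dropped.
    new_edges = set()
    for u, v in edges:
        a, b = merged.get(u, u), merged.get(v, v)
        if a != b:
            new_edges.add((min(a, b), max(a, b)))
    return new_labels, new_edges, chunk_label


def extract_patterns(labels, edges, min_count=2, max_steps=10):
    """Repeat chunking and return the chunk labels in discovery order."""
    found = []
    for _ in range(max_steps):
        step = chunk_once(labels, edges, min_count)
        if step is None:
            break
        labels, edges, chunk = step
        found.append(chunk)
    return found


# Two toy "molecules", each a chain C - O - H (made-up data).
toy_labels = {0: "C", 1: "O", 2: "H", 3: "C", 4: "O", 5: "H"}
toy_edges = {(0, 1), (1, 2), (3, 4), (4, 5)}
print(extract_patterns(toy_labels, toy_edges))
# prints ['(C-O)', '((C-O)-H)']
```

Each pass touches every edge a constant number of times, which fits the abstract's claim of near-linear cost in the graph size only informally. The sketch ignores edge direction, link colors, and self-loops, all of which the abstract says GBI handles.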




Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Matsuda, T., Horiuchi, T., Motoda, H., Washio, T. (2000). Graph-Based Induction for General Graph Structured Data and Its Application to Chemical Compound Data. In: Arikawa, S., Morishita, S. (eds) Discovery Science. DS 2000. Lecture Notes in Computer Science, vol 1967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44418-1_9


  • DOI: https://doi.org/10.1007/3-540-44418-1_9


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41352-3

  • Online ISBN: 978-3-540-44418-3

  • eBook Packages: Springer Book Archive
