Cl-GBI: A Novel Approach for Extracting Typical Patterns from Graph-Structured Data

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3518)

Abstract

Graph-Based Induction (GBI) is a machine learning technique that extracts typical patterns from graph-structured data by stepwise pair expansion (pair-wise chunking). GBI is very efficient because of its greedy search strategy; however, it suffers from the problem of overlapping subgraphs. As a result, some typical patterns cannot be discovered by GBI, even though a beam search has been incorporated in an improved version called Beam-wise GBI (B-GBI). In this paper, the search capability is improved by a new search strategy in which frequent pairs are never chunked but are used as pseudo nodes in subsequent steps, thus allowing the extraction of overlapping subgraphs. The new algorithm, called Cl-GBI (Chunkingless GBI), was tested on two datasets, the promoter dataset from the UCI repository and the hepatitis dataset provided by Chiba University, and was shown to extract more typical patterns than B-GBI.
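
The idea of keeping frequent pairs as pseudo nodes instead of chunking them can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the `beam`, `levels`, and `min_count` parameters, the restriction to disjoint units, and the crude label-based pattern key (a stand-in for proper canonical labeling) are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def pattern_key(nodes, labels, edges):
    """Hypothetical pattern identifier: sorted node labels plus sorted
    label pairs of the induced edges. Not a true canonical labeling."""
    node_part = tuple(sorted(labels[n] for n in nodes))
    edge_part = tuple(sorted(
        tuple(sorted((labels[u], labels[v])))
        for u, v in edges
        if u in nodes and v in nodes
    ))
    return node_part, edge_part


def cl_gbi_sketch(labels, edges, beam=2, levels=2, min_count=2):
    """Toy chunkingless pair expansion on one undirected labeled graph.

    A unit is a frozenset of node ids: either an original node or a
    pseudo node (an instance of a frequent pair found earlier). The
    graph is never rewritten, so overlapping instances all survive.
    """
    units = {frozenset([n]) for n in labels}
    adjacent = defaultdict(set)
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)

    found = {}
    for _ in range(levels):
        # Enumerate pairs of disjoint units joined by at least one edge.
        instances = defaultdict(set)
        for a, b in combinations(units, 2):
            if a & b:
                continue  # assumption: paired units may not share nodes
            if any(adjacent[n] & b for n in a):
                merged = a | b
                instances[pattern_key(merged, labels, edges)].add(merged)

        # Keep the `beam` most frequent pair patterns of this level.
        frequent = sorted(
            ((k, occ) for k, occ in instances.items() if len(occ) >= min_count),
            key=lambda item: (-len(item[1]), item[0]),
        )[:beam]

        for key, occ in frequent:
            found[key] = len(occ)
            units |= occ  # register instances as pseudo nodes, no chunking
    return found
```

On two disjoint a-b-a paths, the sketch counts four instances of the a-b pair (two of which overlap at each b node) and, at the second level, two instances of the a-b-a pattern. A chunking strategy would have to pick one of the overlapping a-b instances per path and rewrite the graph around it.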

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nguyen, P.C., Ohara, K., Motoda, H., Washio, T. (2005). Cl-GBI: A Novel Approach for Extracting Typical Patterns from Graph-Structured Data. In: Ho, T.B., Cheung, D., Liu, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2005. Lecture Notes in Computer Science, vol 3518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11430919_74

  • DOI: https://doi.org/10.1007/11430919_74

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26076-9

  • Online ISBN: 978-3-540-31935-1

  • eBook Packages: Computer Science, Computer Science (R0)
