Regular Article
Approaches to Parallel Graph-Based Knowledge Discovery

https://doi.org/10.1006/jpdc.2000.1696

Abstract

The large amount of data collected today is quickly overwhelming researchers' abilities to interpret the data and discover interesting patterns. Knowledge discovery and data mining systems have the potential to automate the interpretation process, but these approaches frequently rely on computationally expensive algorithms. In particular, scientific discovery systems focus on richer data representations, sometimes without regard for scalability. This research investigates approaches for scaling a particular knowledge discovery and data mining system, Subdue, using parallel and distributed resources. Subdue has been used to discover interesting and repetitive concepts in graph-based databases from a variety of domains, but it requires a substantial amount of processing time. Experiments that demonstrate the scalability of parallel versions of the Subdue system are performed using CAD circuit databases, satellite images, and artificially generated databases, and potential achievements and obstacles are discussed.


This work is supported by NSF Grant IRI-9502260.
