Mining Generalized Association Rule Using Parallel RDB Engine on PC Cluster

Pramudiono, Iko; Shintani, Takahiko; Tamura, Takayuki; Kitsuregawa, Masaru

doi:10.1007/3-540-48298-9_30


Conference paper
First Online: 01 January 2002

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1676)

Abstract

Data mining has been widely recognized as a powerful tool for extracting added value from large-scale databases. One data mining technique, generalized association rule mining with taxonomy, has the potential to discover more useful knowledge than ordinary flat association mining because it takes application-specific information into account. We propose two SQL queries, named TTR-SQL and TH-SQL, to perform this kind of mining and evaluate them on a PC cluster. These queries can be more than 30% faster than the previously reported Apriori-based SQL query. Although an RDBMS offers powerful query processing through SQL, most data mining systems use specialized implementations to achieve better performance. There is a tradeoff between performance and portability: the performance of the SQL approach is not necessarily high, but seamless integration with an existing RDBMS is a considerable advantage. Since RDBs are already very popular, the feasibility of generalized association rule mining can be explored using the proposed SQL queries instead of purchasing expensive mining software. In addition, parallel RDBs are now widely accepted. We show that parallelizing the SQL execution can match the performance of native programs when 10 to 15 nodes are used. Most organizations have many PCs that are not fully utilized, and such resources can be exploited to improve performance significantly.
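To make the idea of mining with a taxonomy concrete, the following is a minimal in-memory sketch in Python. It is not TTR-SQL or TH-SQL, whose definitions are not given in this abstract; it only illustrates the general technique of extending each transaction with the ancestors of its items before counting support. The taxonomy, the transactions, and the names `ancestors`, `extend`, and `frequent_pairs` are all hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxonomy (child -> parent); not taken from the paper.
TAXONOMY = {
    "jacket": "outerwear",
    "ski pants": "outerwear",
    "outerwear": "clothes",
    "shirt": "clothes",
    "shoes": "footwear",
    "hiking boots": "footwear",
}

# Hypothetical transactions, each a set of purchased leaf items.
TRANSACTIONS = [
    {"shirt"},
    {"jacket", "hiking boots"},
    {"ski pants", "hiking boots"},
    {"shoes"},
    {"shoes"},
    {"jacket"},
]


def ancestors(item, taxonomy):
    """Return every ancestor of `item` by walking the child -> parent map."""
    found = set()
    while item in taxonomy:
        item = taxonomy[item]
        found.add(item)
    return found


def extend(transaction, taxonomy):
    """Return the transaction plus all ancestors of its items."""
    extended = set(transaction)
    for item in transaction:
        extended |= ancestors(item, taxonomy)
    return extended


def frequent_pairs(transactions, taxonomy, min_support):
    """Count 2-itemsets over extended transactions.

    Pairs made of an item and one of its own ancestors are skipped,
    because they carry no information beyond the item itself.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(extend(transaction, taxonomy))
        for a, b in combinations(items, 2):
            if a in ancestors(b, taxonomy) or b in ancestors(a, taxonomy):
                continue
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


result = frequent_pairs(TRANSACTIONS, TAXONOMY, min_support=2)
for pair, support in sorted(result.items()):
    print(pair, support)
```

With this toy data and a minimum support of 2, no pair of leaf items is frequent (for example, jacket and hiking boots occur together only once), while generalized pairs such as outerwear and hiking boots reach the threshold. This is the kind of additional knowledge the abstract attributes to mining with a taxonomy.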

Author information

Takayuki Tamura
Present address: Information & Communication System Development Center, Mitsubishi Electric, Ohfuna 5-1-1, Kamakura-shi Kanagawa-ken, 247-8501, Japan

Authors and Affiliations

Institute of Industrial Science, The University of Tokyo, 7-22-1 Roppongi, Minato-ku, Tokyo, 106, Japan
Iko Pramudiono, Takahiko Shintani, Takayuki Tamura & Masaru Kitsuregawa

Editor information

Editors and Affiliations

School of Computer and Information Science, University of South Australia, The Levels, Adelaide, Australia, 05
Mukesh Mohania
IFS, Technical University of Vienna, Resselgasse 3, A-1040, Vienna, Austria
A Min Tjoa

About this paper

Cite this paper

Pramudiono, I., Shintani, T., Tamura, T., Kitsuregawa, M. (1999). Mining Generalized Association Rule Using Parallel RDB Engine on PC Cluster. In: Mohania, M., Tjoa, A.M. (eds) DataWarehousing and Knowledge Discovery. DaWaK 1999. Lecture Notes in Computer Science, vol 1676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48298-9_30

DOI: https://doi.org/10.1007/3-540-48298-9_30
Published: 01 March 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66458-1
Online ISBN: 978-3-540-48298-7
eBook Packages: Springer Book Archive
