Fast Distributed Mining Algorithm of Maximum Frequent Itemsets Based on Cloud Computing

He, Bo

doi:10.1007/978-3-642-53932-9_40

Part of the book series: Communications in Computer and Information Science (CCIS, volume 391)

Included in the following conference series:

International Conference on Information Computing and Applications

Abstract

This paper proposes a fast distributed algorithm for mining maximum frequent itemsets based on cloud computing, named FDMMFI. In FDMMFI, each node first computes its local maximum frequent itemsets by cloud computing; the center node then exchanges data with the other nodes and combines the results; finally, the global maximum frequent itemsets are obtained by cloud computing. Theoretical analysis and experimental results suggest that, under the same minimum support threshold, FDMMFI has lower communication traffic and runtime than CD and FDM. The lower the minimum support threshold, the better the three performance parameters of FDMMFI. The FDMMFI algorithm is fast and effective.
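The local-then-global pattern described in the abstract can be illustrated with a small sketch. This is not the FDMMFI algorithm itself, whose details are not given here. It is a brute-force toy that assumes each node holds a horizontal partition of the transactions and that the same relative minimum support is used locally and globally. Under that assumption, every globally frequent itemset is locally frequent on at least one node, so the subsets of the local maximal frequent itemsets form a complete candidate set for the center node. All function names below are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute force: all itemsets with relative support >= min_support."""
    items = sorted({i for t in transactions for i in t})
    need = min_support * len(transactions)
    result = set()
    for k in range(1, len(items) + 1):
        level = {
            frozenset(c)
            for c in combinations(items, k)
            if sum(1 for t in transactions if set(c) <= t) >= need
        }
        if not level:  # no frequent k-itemset means no larger one either
            break
        result |= level
    return result


def maximal(itemsets):
    """Keep only itemsets that have no proper superset in the collection."""
    return {s for s in itemsets if not any(s < other for other in itemsets)}


def nonempty_subsets(itemset):
    items = sorted(itemset)
    return {
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
    }


def global_maximal(partitions, min_support):
    # Step 1 (each node): mine local maximal frequent itemsets.
    local_max = [maximal(frequent_itemsets(p, min_support)) for p in partitions]
    # Step 2 (center node): merge the local results into global candidates.
    candidates = set()
    for node_result in local_max:
        for m in node_result:
            candidates |= nonempty_subsets(m)
    # Step 3: count global support of each candidate and keep the maximal ones.
    # (A real system would sum per-node counts instead of pooling the data.)
    all_tx = [t for p in partitions for t in p]
    need = min_support * len(all_tx)
    frequent = {c for c in candidates if sum(1 for t in all_tx if c <= t) >= need}
    return maximal(frequent)


# Illustrative data only: two nodes, three transactions each.
node1 = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}]
node2 = [{"b", "c"}, {"a", "b", "c"}, {"b"}]
result = global_maximal([node1, node2], min_support=0.5)
```

In this toy data, node 1 reports {a, b} and {a, c} and node 2 reports {b, c}, so only three small itemsets travel to the center node rather than every locally frequent itemset. That is the intuition behind exchanging maximal itemsets to reduce communication traffic.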



Author information

Authors and Affiliations

School of Computer Science and Engineering, Chongqing University of Technology, ChongQing, 400054, China
Bo He
State Key Laboratories for Novel Software Technology, Nanjing University, 210093, China
Bo He
Shenzhen Key Laboratory of High Performance Data Mining, Shenzhen, 518055, China
Bo He

Editor information

Editors and Affiliations

Shanghai Jiao Tong University, 800 Dongchuan Road, Dianxinqunlou 1-401, 200240, Shanghai, China
Yuhang Yang
School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang Avenue, 639798, Singapore, Singapore
Maode Ma
College of Science, Hebei United University, 063009, Tangshan, Hebei, China
Baoxiang Liu

Cite this paper

He, B. (2013). Fast Distributed Mining Algorithm of Maximum Frequent Itemsets Based on Cloud Computing. In: Yang, Y., Ma, M., Liu, B. (eds) Information Computing and Applications. ICICA 2013. Communications in Computer and Information Science, vol 391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53932-9_40

DOI: https://doi.org/10.1007/978-3-642-53932-9_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53931-2
Online ISBN: 978-3-642-53932-9
eBook Packages: Computer Science, Computer Science (R0)
