DOI: 10.1145/2818869.2818903
research-article

Fast Mining Frequent Patterns with Secondary Memory

Published: 07 October 2015

Abstract

Data mining technology has been widely studied and applied in recent years, and frequent pattern mining is one important area of that research, popular in both academia and the business community. As technology has advanced, databases have grown so large that mining them can become impossible under memory restrictions. In this study, we propose a novel algorithm called Hybrid Mine (H-Mine) to help improve this situation. H-Mine saves to hard disk the part of the information that is not stored in memory, and by mining across hard disk and memory together it can complete data mining with limited memory. The results of empirical evaluation under various simulation conditions show that H-Mine delivers excellent performance in terms of execution efficiency and scalability.
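
The abstract does not describe H-Mine's data structures, so the following is not the authors' algorithm. It is only a minimal Python sketch of the general idea of mixing disk and memory: support counts for k-itemsets are kept in memory up to a budget, spilled to disk as sorted runs when the budget is reached, and merged at the end. The function name `frequent_itemsets`, the parameters `k`, `min_support` and `max_entries`, and the run-file format are all assumptions made for illustration.

```python
import heapq
import itertools
import os
import tempfile
from collections import Counter


def _spill(counts, run_dir, run_id):
    """Write one sorted run of (itemset, count) pairs to disk; return its path."""
    path = os.path.join(run_dir, f"run_{run_id}.txt")
    with open(path, "w", encoding="utf-8") as f:
        for itemset in sorted(counts):
            f.write(",".join(itemset) + "\t" + str(counts[itemset]) + "\n")
    return path


def _read_run(path):
    """Stream (itemset, count) pairs back from a sorted run file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, count = line.rstrip("\n").split("\t")
            yield tuple(key.split(",")), int(count)


def frequent_itemsets(transactions, k, min_support, max_entries):
    """Return {itemset: support} for k-itemsets with support >= min_support.

    Hypothetical helper, not from the paper. Items are assumed to be strings
    without commas, tabs or newlines. The in-memory counter is flushed to a
    run file once it holds max_entries or more distinct itemsets (checked
    after each transaction, so the budget is approximate).
    """
    with tempfile.TemporaryDirectory() as run_dir:
        runs, counts = [], Counter()
        for transaction in transactions:
            for itemset in itertools.combinations(sorted(set(transaction)), k):
                counts[itemset] += 1
            if len(counts) >= max_entries:
                runs.append(_spill(counts, run_dir, len(runs)))
                counts = Counter()
        if counts:
            runs.append(_spill(counts, run_dir, len(runs)))

        # Each run is sorted by itemset, so a k-way merge brings equal
        # itemsets together and their partial counts can be summed in a
        # single streaming pass.
        result = {}
        merged = heapq.merge(*(_read_run(p) for p in runs))
        for itemset, group in itertools.groupby(merged, key=lambda pair: pair[0]):
            support = sum(count for _, count in group)
            if support >= min_support:
                result[itemset] = support
        return result
```

Calling `frequent_itemsets(transactions, k=2, min_support=2, max_entries=2)` on a small list of transactions forces several spills, while a large `max_entries` keeps everything in memory; both settings should return the same supports. The trade-off in this sketch is extra disk I/O and a final merge pass in exchange for a bounded counter.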

Cited By

  • (2018) Rare pattern mining: challenges and future perspectives. Complex & Intelligent Systems 5:1 (1-23). DOI: 10.1007/s40747-018-0085-9. Online publication date: 10-Nov-2018

Published In

ASE BD&SI '15: Proceedings of the ASE BigData & SocialInformatics 2015
October 2015
381 pages
ISBN:9781450337359
DOI:10.1145/2818869

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Data Mining
  2. Disk Storage
  3. Frequent Pattern Mining
  4. Main Memory

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ASE BD&SI '15
ASE BD&SI '15: ASE BigData & SocialInformatics 2015
October 7 - 9, 2015
Kaohsiung, Taiwan
