High Utility Itemsets Mining Based on Divide-and-Conquer Strategy

Liao, Jiyong; Wu, Sheng; Liu, Ailian

doi:10.1007/s11277-020-07753-w

Published: 27 August 2020

Volume 116, pages 1639–1657, (2021)

Wireless Personal Communications

Abstract

High utility itemset mining has become an active research topic in association rule mining. Many algorithms mine the dataset directly, which causes a problem on dense datasets, where each transaction stores too many itemsets: mining association rules then takes a lot of storage space and slows the algorithm down. Existing work lacks efficient itemset mining algorithms for dense datasets. To address this problem, a high utility itemset mining algorithm based on a divide-and-conquer strategy is proposed. An improved silhouette coefficient is used to select the best number of K-means clusters, and the dataset is divided into many smaller subclasses. Association rule mining is then performed on each subclass by a Boolean matrix compression operation, and the per-subclass results are iteratively merged to obtain the final mining results. We also analyze the time complexity of our method and of the Apriori algorithm. Finally, experiments on several well-known real-world datasets show that the improved algorithm runs faster and consumes less memory on dense datasets.
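The divide-and-conquer pipeline described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: it uses the standard silhouette coefficient (Rousseeuw, 1987) with Hamming distance rather than the authors' improved variant, it compares hand-written candidate clusterings instead of running K-means, and it counts plain support on an uncompressed Boolean transaction matrix rather than computing utility. All function names (`silhouette`, `support_counts`, `mine_partitioned`) and the toy data are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions where two Boolean rows differ."""
    return sum(x != y for x, y in zip(a, b))


def silhouette(rows, labels):
    """Mean standard silhouette coefficient; needs at least two clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        # a: mean distance to the row's own cluster
        a = sum(hamming(rows[i], rows[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(hamming(rows[i], rows[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def support_counts(rows, size):
    """Count the rows containing every item of each `size`-item column set."""
    counts = {}
    for itemset in combinations(range(len(rows[0])), size):
        c = sum(all(row[i] for i in itemset) for row in rows)
        if c:
            counts[itemset] = c
    return counts


def mine_partitioned(rows, labels, size, min_count):
    """Mine each subclass separately, merge the counts, then apply the threshold."""
    total = {}
    for lab in set(labels):
        part = [r for r, l in zip(rows, labels) if l == lab]
        for itemset, c in support_counts(part, size).items():
            total[itemset] = total.get(itemset, 0) + c
    return {s: c for s, c in total.items() if c >= min_count}


# Toy Boolean transaction matrix: one row per transaction, one column per item.
rows = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical candidate clusterings, standing in for K-means runs with k = 2 and k = 3.
candidates = {2: [0, 0, 0, 1, 1, 1], 3: [0, 1, 0, 2, 2, 2]}
best_k = max(candidates, key=lambda k: silhouette(rows, candidates[k]))
frequent = mine_partitioned(rows, candidates[best_k], size=2, min_count=3)
```

In this sketch the threshold is applied only after the per-subclass counts have been merged, so the result matches mining the whole matrix directly. Pruning inside each subclass with a local threshold would be cheaper but could drop itemsets that are frequent only in aggregate; how the paper handles this is not stated in the abstract.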

Author information

Authors and Affiliations

School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, Yunnan, China
Jiyong Liao, Sheng Wu & Ailian Liu

Corresponding author

Correspondence to Jiyong Liao.

About this article

Cite this article

Liao, J., Wu, S. & Liu, A. High Utility Itemsets Mining Based on Divide-and-Conquer Strategy. Wireless Pers Commun 116, 1639–1657 (2021). https://doi.org/10.1007/s11277-020-07753-w

Issue Date: February 2021