Abstract:
Although FP-growth is known as an efficient frequent pattern mining algorithm, scaling it to large transactional databases remains an open problem. In this paper, we propose a shared-memory, task-parallel method for FP-growth. In the proposed method, each task receives conditional transactions from the current FP-tree, builds a conditional FP-tree, and generates the next tasks for the successive branches of the search tree. During this process, we dynamically estimate the workloads of these next tasks from the corresponding conditional FP-trees and balance them among processors by work-stealing. Furthermore, we found that Borgelt's FP-tree construction method contributes notably to our shared-memory parallelization. To exploit computational resources in a lightweight way, we implement the proposed method in Rust, a compiled language that manages memory safely without garbage collection. On most of the benchmark datasets we tested, the proposed method generally showed better parallel performance than two previous methods.
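The task structure described in the abstract can be illustrated with a minimal Rust sketch. It is not the authors' implementation: it replaces FP-trees with plain conditional transaction lists, omits Borgelt's construction method and the workload estimation, and replaces work-stealing with one scoped thread per top-level branch. All names (`item_counts`, `conditional_db`, `mine`, `mine_parallel`) are hypothetical, and transactions are assumed to be duplicate-free item lists.

```rust
use std::collections::BTreeMap;
use std::thread;

// Assumption: items are integer ids; each transaction has no duplicate items.
type Item = u32;
type Transaction = Vec<Item>;

/// Count the support of every item in a (conditional) database.
fn item_counts(db: &[Transaction]) -> BTreeMap<Item, usize> {
    let mut counts = BTreeMap::new();
    for t in db {
        for &i in t {
            *counts.entry(i).or_insert(0) += 1;
        }
    }
    counts
}

/// Conditional database for `item`: the transactions that contain `item`,
/// restricted to larger items so each pattern is enumerated exactly once.
/// (A plain list stands in for the conditional FP-tree here.)
fn conditional_db(db: &[Transaction], item: Item) -> Vec<Transaction> {
    db.iter()
        .filter(|t| t.contains(&item))
        .map(|t| t.iter().copied().filter(|&i| i > item).collect())
        .collect()
}

/// One task: report every frequent pattern that extends `prefix` within `db`.
fn mine(
    db: &[Transaction],
    prefix: &[Item],
    min_sup: usize,
    out: &mut Vec<(Vec<Item>, usize)>,
) {
    for (item, count) in item_counts(db) {
        if count < min_sup {
            continue;
        }
        let mut pattern = prefix.to_vec();
        pattern.push(item);
        out.push((pattern.clone(), count));
        // The next task for this branch works on a smaller conditional database.
        let cond = conditional_db(db, item);
        mine(&cond, &pattern, min_sup, out);
    }
}

/// Run each top-level branch as its own task on a scoped thread.
/// Static spawning is used here in place of work-stealing.
fn mine_parallel(db: &[Transaction], min_sup: usize) -> Vec<(Vec<Item>, usize)> {
    let frequent: Vec<(Item, usize)> = item_counts(db)
        .into_iter()
        .filter(|&(_, c)| c >= min_sup)
        .collect();
    let mut results = thread::scope(|s| {
        let handles: Vec<_> = frequent
            .iter()
            .map(|&(item, count)| {
                s.spawn(move || {
                    let mut out = vec![(vec![item], count)];
                    let cond = conditional_db(db, item);
                    mine(&cond, &[item], min_sup, &mut out);
                    out
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect::<Vec<_>>()
    });
    results.sort(); // deterministic order regardless of thread scheduling
    results
}
```

Calling `mine_parallel(&db, min_sup)` returns each frequent itemset with its support count. Because branch sizes can differ widely, spawning one thread per branch can leave some processors idle, which is the imbalance the abstract's dynamic workload estimation and work-stealing are meant to address.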
Published in: 2021 IEEE 12th International Workshop on Computational Intelligence and Applications (IWCIA)
Date of Conference: 06-07 November 2021
Date Added to IEEE Xplore: 30 November 2021
ISBN Information:
Print on Demand (PoD) ISSN: 1883-3977