PNPFI: An Efficient Parallel Frequent Itemsets Mining Algorithm

Abstract:

Frequent itemsets mining (FIM) plays an important role in many data mining areas. As data volumes have grown, a number of parallel FIM algorithms have been proposed. Although existing solutions scale well, they consume substantial CPU time and memory because they mine frequent itemsets recursively over a tree structure. In this paper, we propose a novel parallel algorithm named PNPFI, which employs three key optimizations. First, itemsets are stored in the N-list structure, which is more compact than existing tree-based structures. Second, a new structure called P-Subsume generates some frequent itemsets without N-list intersection. Third, a new load-balancing strategy divides a large-scale FIM problem into a set of tasks based on the profiled load of each item. Experimental results show that, compared with state-of-the-art algorithms, PNPFI improves performance by 39% on average (up to 79%) and reduces memory usage by 58% on average (up to 90%).
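
The abstract does not say how PNPFI turns profiled per-item loads into tasks. As a rough illustration of the general idea only, the sketch below uses a standard greedy heuristic (longest processing time first), which is a swapped-in technique and not the paper's strategy. The function name `assign_items_to_tasks`, the item names, and the load values are all hypothetical.

```python
import heapq


def assign_items_to_tasks(item_load, num_tasks):
    """Greedy LPT sketch: place the heaviest item on the lightest task.

    item_load maps an item to its (hypothetical) profiled mining cost.
    This is NOT PNPFI's strategy, only a generic balancing heuristic.
    """
    # Heap entries are (total_load, task_id); task_id breaks ties.
    heap = [(0, task_id) for task_id in range(num_tasks)]
    heapq.heapify(heap)
    tasks = [[] for _ in range(num_tasks)]

    # Visit items from heaviest to lightest, ties broken by item name.
    ordered = sorted(item_load.items(), key=lambda kv: (-kv[1], kv[0]))
    for item, load in ordered:
        total, task_id = heapq.heappop(heap)
        tasks[task_id].append(item)
        heapq.heappush(heap, (total + load, task_id))

    totals = [sum(item_load[i] for i in task) for task in tasks]
    return tasks, totals


# Made-up loads for five items, split across two tasks.
profiled = {"a": 8, "b": 7, "c": 6, "d": 5, "e": 4}
tasks, totals = assign_items_to_tasks(profiled, 2)
print(tasks, totals)
```

A heuristic of this kind keeps the heaviest items apart, so no single task is dominated by several expensive items. How closely it resembles the strategy in the paper cannot be determined from the abstract.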

Published in: 2018 IEEE 22nd International Conference on Computer Supported Cooperative Work in Design (CSCWD)

Date of Conference: 09-11 May 2018

Date Added to IEEE Xplore: 16 September 2018

DOI: 10.1109/CSCWD.2018.8465270

Conference Location: Nanjing, China