An efficient method for mining frequent sequential patterns using multi-Core processors

Huynh, Bao; Vo, Bay; Snasel, Vaclav

doi:10.1007/s10489-016-0859-y

Published: 02 November 2016

Volume 46, pages 703–716 (2017)

Applied Intelligence

Abstract

The problem of mining frequent sequential patterns (FSPs) has attracted a great deal of research attention. Although there are many efficient algorithms for mining FSPs, the mining time is still high, especially for large or dense datasets. Parallel processing has been widely applied to improve processing speed for various problems. Some parallel algorithms have been proposed, but most of them have problems related to synchronization and load balancing. Based on a multi-core processor architecture, this paper proposes a load-balancing parallel approach called Parallel Dynamic Bit Vector Sequential Pattern Mining (pDBV-SPM) for mining FSPs from huge datasets, using the dynamic bit vector data structure to determine support values quickly. In the pDBV-SPM approach, the frequent 1-sequences are sorted in ascending order of support count and then partitioned into parts, each of which is assigned to a task on a processor. As a result, most of the nodes in the leftmost branches will be infrequent and thus pruned during the search, which helps to better balance the search tree. Experiments are conducted to verify the effectiveness of pDBV-SPM. The experimental results show that the proposed algorithm outperforms PIB-PRISM for mining FSPs in terms of mining time and memory usage.
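
The two ideas named in the abstract, a bit vector trimmed of zero bytes for support counting and an ascending-support ordering of frequent 1-sequences before partitioning, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `DBV` layout, the round-robin partitioning, and the names `order_by_support`, `partition`, `count_pairs`, and `mine_pairs_parallel` are assumptions made for illustration. The sketch also tracks only which sequences contain a pattern, not positions inside them, so it counts co-occurrence rather than true sequential order.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class DBV:
    """Bit vector with leading and trailing zero bytes trimmed (assumed layout)."""
    start: int   # index of the first non-zero byte in the full vector
    data: bytes  # bytes from the first to the last non-zero byte

    @staticmethod
    def from_ids(ids):
        """Build a DBV from the IDs of the sequences that contain a pattern."""
        if not ids:
            return DBV(0, b"")
        full = bytearray(max(ids) // 8 + 1)
        for i in ids:
            full[i // 8] |= 1 << (i % 8)
        first = next(k for k, byte in enumerate(full) if byte)
        return DBV(first, bytes(full[first:]))

    def intersect(self, other):
        """AND two DBVs over their overlapping byte range only."""
        lo = max(self.start, other.start)
        hi = min(self.start + len(self.data), other.start + len(other.data))
        if lo >= hi:
            return DBV(0, b"")
        out = bytes(self.data[k - self.start] & other.data[k - other.start]
                    for k in range(lo, hi))
        trimmed = out.lstrip(b"\x00")
        lead = len(out) - len(trimmed)
        trimmed = trimmed.rstrip(b"\x00")
        return DBV(lo + lead, trimmed) if trimmed else DBV(0, b"")

    def support(self):
        """Support = number of set bits = number of containing sequences."""
        return sum(bin(byte).count("1") for byte in self.data)


def order_by_support(patterns):
    """Sort (name, DBV) pairs by ascending support; ties broken by name."""
    return sorted(patterns.items(), key=lambda kv: (kv[1].support(), kv[0]))


def partition(ordered, n_parts):
    """Deal the ordered 1-patterns round-robin into n_parts (assumed scheme)."""
    return [ordered[i::n_parts] for i in range(n_parts)]


def count_pairs(part, ordered, minsup):
    """Task body: join each assigned pattern with those after it in the order."""
    rank = {name: r for r, (name, _) in enumerate(ordered)}
    found = {}
    for name, dbv in part:
        for other, other_dbv in ordered[rank[name] + 1:]:
            sup = dbv.intersect(other_dbv).support()
            if sup >= minsup:  # infrequent joins are pruned here
                found[(name, other)] = sup
    return found


def mine_pairs_parallel(patterns, minsup, n_tasks=2):
    """Filter frequent 1-patterns, order, partition, and run one task per part."""
    frequent = {n: v for n, v in patterns.items() if v.support() >= minsup}
    ordered = order_by_support(frequent)
    result = {}
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        tasks = pool.map(lambda p: count_pairs(p, ordered, minsup),
                         partition(ordered, n_tasks))
        for found in tasks:
            result.update(found)
    return result
```

Threads are used here only to keep the sketch short. In CPython they do not run Python bytecode on several cores at once, so a process pool or a language with native threads would be needed to see a multi-core speedup.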

Author information

Authors and Affiliations

Center for Applied Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Bao Huynh
Division of Data Science, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Bay Vo
Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Bay Vo
VŠB-Technical University of Ostrava, Ostrava-Poruba, Czech Republic
Vaclav Snasel

Corresponding author

Correspondence to Bay Vo.

About this article

Cite this article

Huynh, B., Vo, B. & Snasel, V. An efficient method for mining frequent sequential patterns using multi-Core processors. Appl Intell 46, 703–716 (2017). https://doi.org/10.1007/s10489-016-0859-y

Published: 02 November 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s10489-016-0859-y
