Scalable Technique to Discover Items Support from Trie Data Structure

Noraziah, A.; Abdullah, Zailani; Herawan, Tutut; Deris, Mustafa Mat

doi:10.1007/978-3-642-34062-8_65

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7473)

Included in the following conference series:

International Conference on Information Computing and Applications

Abstract

The frequent pattern tree (FP-Tree) is a popular and compact trie data structure for representing frequent patterns. Before an FP-Tree can be constructed, the original database must be scanned twice. One of these scans determines the items support (the items and their support) that fulfills the minimum support threshold, and it requires reading the entire database. However, if the database changes suddenly, this process must be repeated from the beginning. In this paper, we introduce a technique called Fast Determination of Item Support Technique (F-DIST) to capture the items support from our proposed Disorder Support Trie Itemset (DOSTrieIT) data structure. Experiments on three UCI benchmark datasets show that capturing the items support using F-DIST from DOSTrieIT takes about 3 orders of magnitude less computational time than the classical FP-Tree technique, thus verifying its scalability.
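
The item-support scan that the abstract describes can be sketched as below. This is a minimal illustration of the classical counting step only, not the authors' F-DIST or DOSTrieIT, whose details the abstract does not give. The function names, the toy database, and the threshold value are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, List


def item_supports(transactions: Iterable[Iterable[str]]) -> Counter:
    """Count, for each item, the number of transactions that contain it."""
    counts: Counter = Counter()
    for transaction in transactions:
        # set() ensures an item repeated inside one transaction counts once.
        counts.update(set(transaction))
    return counts


def frequent_items(counts: Counter, min_support: int) -> Dict[str, int]:
    """Keep items that meet the threshold, ordered by descending support."""
    kept = [(item, n) for item, n in counts.items() if n >= min_support]
    # Ties are broken alphabetically so the ordering is deterministic.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(kept)


# Hypothetical toy database; each inner list is one transaction.
database: List[List[str]] = [
    ["a", "b", "c"],
    ["a", "c"],
    ["a", "d"],
    ["b", "c"],
]

supports = item_supports(database)
print(frequent_items(supports, min_support=2))  # {'a': 3, 'c': 3, 'b': 2}

# When the database changes, the classical approach rescans every transaction.
new_transaction = ["d", "b"]
database.append(new_transaction)
rescanned = item_supports(database)

# Keeping the counts in a persistent structure lets only the new transaction
# be processed. This generic counter update is an assumption for illustration.
supports.update(set(new_transaction))
assert supports == rescanned
```

The full rescan touches every transaction again, while the update step touches only the new one, which is the cost that a persistent support structure aims to avoid.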

Author information

Authors and Affiliations

Faculty of Computer Systems and Software Engineering, Universiti Malaysia Pahang, Lebuhraya Tun Razak, 26300, Kuantan, Pahang, Malaysia
A. Noraziah & Tutut Herawan
Department of Computer Science, Universiti Malaysia Terengganu, 21030, Kuala Terengganu, Malaysia
Zailani Abdullah
Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Parit Raja, Batu Pahat, 86400, Johor, Malaysia
Mustafa Mat Deris

Editor information

Editors and Affiliations

College of Science, Hebei United University, 063000, Tangshan, Hebei, China
Baoxiang Liu
Nanyang Technological University, Singapore
Maode Ma
College of Science, Hebei United University, 063009, Tangshan, Hebei, China
Jincai Chang

About this paper

Cite this paper

Noraziah, A., Abdullah, Z., Herawan, T., Deris, M.M. (2012). Scalable Technique to Discover Items Support from Trie Data Structure. In: Liu, B., Ma, M., Chang, J. (eds) Information Computing and Applications. ICICA 2012. Lecture Notes in Computer Science, vol 7473. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34062-8_65

DOI: https://doi.org/10.1007/978-3-642-34062-8_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34061-1
Online ISBN: 978-3-642-34062-8
eBook Packages: Computer Science, Computer Science (R0)
