The Augmented Itemset Tree: A Data Structure for Online Maximum Frequent Pattern Mining

Schmidt, Jana; Kramer, Stefan

doi:10.1007/978-3-642-24477-3_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6926)

Included in the following conference series:

International Conference on Discovery Science

Abstract

This paper introduces an approach for incremental maximal frequent pattern (MFP) mining in sparse binary data, where instances are observed one by one. For this purpose, we propose the Augmented Itemset Tree (AIST), a data structure that incorporates features of the FP-tree into the itemset tree. In the given setting, we assume that just the data structure is maintained in main memory, and each instance is observed only once. The AIST not only stores observed frequent patterns, but also allows for quick frequency updates of relevant subpatterns. In order to quickly identify the current set of exact MFPs, potential candidates are extracted from former MFPs and patterns that occur in the new instance. The presented approach is evaluated concerning the runtime and memory requirements depending on the number of instances, minimum support and different settings of pattern properties. The obtained results suggest that AISTs are useful for mining maximal frequent itemsets in an online setting whenever larger patterns can be expected.
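
To make the notion of a maximal frequent pattern in an online setting concrete, the following is a minimal sketch of a brute-force baseline. It is not the AIST and not the authors' algorithm: it counts every non-empty subset of each incoming instance, which is only feasible for very small, sparse instances. The class name `NaiveOnlineMFP`, the method names, and the use of an absolute count as the minimum support threshold are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


class NaiveOnlineMFP:
    """Brute-force baseline (hypothetical, NOT the AIST from the paper)."""

    def __init__(self, min_support):
        # Assumption: min_support is an absolute number of instances.
        self.min_support = min_support
        self.counts = Counter()   # itemset -> number of supporting instances
        self.n_instances = 0

    def observe(self, instance):
        """Process one instance exactly once; the raw instance is not stored."""
        items = sorted(set(instance))
        self.n_instances += 1
        # Exponential in len(items): acceptable only for tiny sparse instances.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                self.counts[frozenset(subset)] += 1

    def maximal_frequent(self):
        """Return frequent itemsets that have no frequent proper superset."""
        frequent = [p for p, c in self.counts.items() if c >= self.min_support]
        return {p for p in frequent if not any(p < q for q in frequent)}


miner = NaiveOnlineMFP(min_support=2)
for instance in [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"c", "d"}]:
    miner.observe(instance)
print(miner.maximal_frequent())
```

With a minimum support of 2, the only maximal frequent pattern after these four instances is the itemset containing a, b and c: all of its subsets are frequent too, but each is covered by that superset. The exponential subset enumeration is exactly the cost that a dedicated structure such as the AIST is meant to avoid.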

Author information

Authors and Affiliations

Institut für Informatik/I12, TU München, Boltzmannstr. 3, 85748, Garching b. München, Germany
Jana Schmidt & Stefan Kramer

Editor information

Editors and Affiliations

Department of Software Systems, Tampere University of Technology, P. O. Box 553, 33101, Tampere, Finland
Tapio Elomaa
Department of Information and Computer Science, Aalto University School of Science, P.O. Box 15400, 00076, Aalto, Finland
Jaakko Hollmén
Helsinki Institute for Information Technology (HIIT), Finland
Heikki Mannila

Cite this paper

Schmidt, J., Kramer, S. (2011). The Augmented Itemset Tree: A Data Structure for Online Maximum Frequent Pattern Mining. In: Elomaa, T., Hollmén, J., Mannila, H. (eds) Discovery Science. DS 2011. Lecture Notes in Computer Science, vol 6926. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24477-3_23

DOI: https://doi.org/10.1007/978-3-642-24477-3_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24476-6
Online ISBN: 978-3-642-24477-3
eBook Packages: Computer Science, Computer Science (R0)
