Itemset Support Queries Using Frequent Itemsets and Their Condensed Representations

Mielikäinen, Taneli; Panov, Panče; Džeroski, Sašo

doi:10.1007/11893318_18


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4265)

Included in the following conference series:

International Conference on Discovery Science


Abstract

The purpose of this paper is twofold. First, we give efficient algorithms for answering itemset support queries for collections of itemsets from various representations of the frequency information. As index structures we use itemset tries built from transaction databases, from frequent itemsets, and from their condensed representations. Second, we evaluate the usefulness of condensed representations of frequent itemsets for answering itemset support queries, using the proposed query algorithms and index structures. We analyze the worst-case time complexities of querying condensed representations and evaluate the query efficiency experimentally with random itemset queries on several benchmark transaction databases.
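To make the idea of an itemset support query concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a toy transaction database, a brute-force miner, and hypothetical names (`ItemsetTrie`, `frequent_itemsets`, `closed_support`) chosen only for illustration. It stores frequent itemsets in a trie keyed by sorted items, and it also shows one well-known way to recover a support from closed frequent itemsets: the support of an itemset equals the largest support among its closed supersets.

```python
from itertools import combinations


class ItemsetTrie:
    """Trie over itemsets; each path of sorted items ends at a stored support."""

    def __init__(self):
        self.children = {}
        self.support = None

    def insert(self, itemset, support):
        node = self
        for item in sorted(itemset):
            node = node.children.setdefault(item, ItemsetTrie())
        node.support = support

    def query(self, itemset):
        """Return the stored support, or None if the itemset is not in the trie."""
        node = self
        for item in sorted(itemset):
            node = node.children.get(item)
            if node is None:
                return None
        return node.support


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of frequent itemsets (illustration only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(len(items) + 1):
        for cand in combinations(items, k):
            count = sum(1 for t in transactions if set(cand) <= t)
            if count >= min_support:
                result[frozenset(cand)] = count
    return result


def closed_support(closed, itemset):
    """Support from closed itemsets: maximum support over closed supersets."""
    supports = [s for c, s in closed.items() if itemset <= c]
    return max(supports) if supports else None


# Toy database (an assumption made for this sketch, not data from the paper).
transactions = [
    frozenset("abc"), frozenset("ab"), frozenset("ac"),
    frozenset("b"), frozenset("abc"),
]
freq = frequent_itemsets(transactions, min_support=2)

trie = ItemsetTrie()
for itemset, support in freq.items():
    trie.insert(itemset, support)

# An itemset is closed if no proper superset has the same support.
closed = {
    x: s for x, s in freq.items()
    if not any(x < y and s == t for y, t in freq.items())
}

print(trie.query({"a", "b"}))                   # support looked up in the trie
print(closed_support(closed, frozenset("c")))   # support derived from closed sets
```

In this sketch a trie lookup costs time proportional to the size of the queried itemset, while the closed-itemset query scans every closed set. That scan is a deliberately naive choice made for brevity; an index over the condensed representation would avoid it.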



Author information

Authors and Affiliations

HIIT BRU, Department of Computer Science, University of Helsinki, Finland
Taneli Mielikäinen
Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia
Panče Panov & Sašo Džeroski


Editor information

Editors and Affiliations

Jozef Stefan Institute, Jamova 39, 1000, Ljubljana, Slovenia
Ljupčo Todorovski
University of Nova Gorica, Nova Gorica, Slovenia
Nada Lavrač
Meme Media Laboratory, Hokkaido University Sapporo, Kita 13, Nishi 8, Kita-ku, P.O. Box, 060-8628, Sapporo, Japan
Klaus P. Jantke


About this paper

Cite this paper

Mielikäinen, T., Panov, P., Džeroski, S. (2006). Itemset Support Queries Using Frequent Itemsets and Their Condensed Representations. In: Todorovski, L., Lavrač, N., Jantke, K.P. (eds) Discovery Science. DS 2006. Lecture Notes in Computer Science, vol 4265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893318_18


DOI: https://doi.org/10.1007/11893318_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46491-4
Online ISBN: 978-3-540-46493-8
eBook Packages: Computer Science, Computer Science (R0)
