On Maximal Frequent and Minimal Infrequent Sets in Binary Matrices

Boros, E.; Gurvich, V.; Khachiyan, L.; Makino, K.

doi:10.1023/A:1024605820527

Published: November 2003

Volume 39, pages 211–221 (2003)

Annals of Mathematics and Artificial Intelligence

Abstract

Given an m×n binary matrix A, a subset C of the columns is called t-frequent if there are at least t rows in A in which all entries belonging to C are non-zero. Let us denote by α the number of maximal t-frequent sets of A, and let β denote the number of those minimal column subsets of A which are not t-frequent (so-called t-infrequent sets). We prove that the inequality α≤(m−t+1)β holds for any binary matrix A in which not all column subsets are t-frequent. This inequality is sharp, and allows for an incremental quasi-polynomial algorithm for generating all minimal t-infrequent sets. We also prove that the analogous generation problem for maximal t-frequent sets is NP-hard. Finally, we discuss the complexity of generating closed frequent sets and some other related problems.
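To make the definitions concrete, the following is a minimal brute-force sketch in Python. It is not the incremental quasi-polynomial algorithm referred to in the abstract. It enumerates every column subset, so it is only usable for very small n. The function names and the 4×3 example matrix are illustrative assumptions and do not come from the article.

```python
from itertools import combinations


def support(A, C):
    """Number of rows of A in which every entry in the columns of C is non-zero."""
    return sum(1 for row in A if all(row[j] != 0 for j in C))


def is_frequent(A, C, t):
    """A column subset C is t-frequent if at least t rows support it."""
    return support(A, C) >= t


def maximal_frequent_and_minimal_infrequent(A, t):
    """Brute-force enumeration over all 2^n column subsets (illustration only)."""
    n = len(A[0])
    maximal, minimal = [], []
    for k in range(n + 1):
        for C in map(frozenset, combinations(range(n), k)):
            if is_frequent(A, C, t):
                # Frequency is monotone (subsets of frequent sets are frequent),
                # so C is maximal iff adding any single column breaks it.
                if not any(is_frequent(A, C | {j}, t) for j in range(n) if j not in C):
                    maximal.append(C)
            elif all(is_frequent(A, C - {j}, t) for j in C):
                # C is infrequent and every one-column-smaller subset is frequent.
                minimal.append(C)
    return maximal, minimal


# Hypothetical example matrix: m = 4 rows, n = 3 columns, threshold t = 2.
A = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 0, 1],
]
t = 2
m = len(A)

maximal, minimal = maximal_frequent_and_minimal_infrequent(A, t)
alpha, beta = len(maximal), len(minimal)
bound = (m - t + 1) * beta

print("maximal t-frequent sets:", sorted(sorted(C) for C in maximal))
print("minimal t-infrequent sets:", sorted(sorted(C) for C in minimal))
print("alpha =", alpha, " (m - t + 1) * beta =", bound)
```

In this example every pair of columns is supported by exactly two rows, while the full column set is supported by only one. The three pairs are therefore the maximal 2-frequent sets (α = 3) and the full set is the only minimal 2-infrequent set (β = 1), so the bound (m−t+1)β = 3 is met with equality on this instance.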

Author information

Authors and Affiliations

RUTCOR, Rutgers University, 640 Bartholomew Road, Piscataway, New Jersey, 08854-8003, USA
E. Boros & V. Gurvich
Department of Computer Science, Rutgers University, 110 Frelinghuysen Road, Piscataway, New Jersey, 08854-8019, USA
L. Khachiyan
Division of Systems Science, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka, 560-8531, Japan
K. Makino

About this article

Cite this article

Boros, E., Gurvich, V., Khachiyan, L. et al. On Maximal Frequent and Minimal Infrequent Sets in Binary Matrices. Annals of Mathematics and Artificial Intelligence 39, 211–221 (2003). https://doi.org/10.1023/A:1024605820527

Issue Date: November 2003
DOI: https://doi.org/10.1023/A:1024605820527
