On frequent sets of Boolean matrices

Sloan, Robert H.; Takata, Ken; Turán, György

doi:10.1023/A:1018905417023

Robert H. Sloan¹, Ken Takata² & György Turán²,³

Abstract

Given a Boolean matrix and a threshold t, a subset of the columns is frequent if there are at least t rows having a 1 entry in each corresponding position. This concept is used in the algorithmic, combinatorial approach to knowledge discovery and data mining. We consider the complexity aspects of frequent sets. An explicit family of subsets is given that requires exponentially many rows to be represented as the family of frequent sets of a matrix, with any threshold. Examples are given of families that can be represented by a small matrix with threshold t, but that require a significantly larger matrix if the threshold is less than t. We also discuss the connections of these problems to circuit complexity and the existence of efficient listing algorithms.
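
The definition of a frequent set in the abstract can be illustrated with a small brute-force sketch. This is not the authors' method and is not taken from the paper; the function names (`support`, `is_frequent`, `frequent_sets`) and the matrix `EXAMPLE` are hypothetical, chosen only for illustration. The sketch assumes the matrix is given as a list of 0/1 rows and that the empty column set is counted as frequent whenever the matrix has at least t rows, which is one possible convention.

```python
from itertools import combinations

# Hypothetical 4x3 Boolean matrix used only for illustration.
EXAMPLE = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 0, 1],
]

def support(matrix, columns):
    """Count the rows that have a 1 in every column index in `columns`."""
    return sum(1 for row in matrix if all(row[c] == 1 for c in columns))

def is_frequent(matrix, columns, t):
    """A column subset is frequent if at least t rows are all-ones on it."""
    return support(matrix, columns) >= t

def frequent_sets(matrix, t):
    """Enumerate all frequent column subsets by exhaustive search.

    Exponential in the number of columns; intended only to make the
    definition concrete, not as an efficient listing algorithm.
    """
    n = len(matrix[0]) if matrix else 0
    return [
        frozenset(cols)
        for k in range(n + 1)
        for cols in combinations(range(n), k)
        if is_frequent(matrix, cols, t)
    ]

if __name__ == "__main__":
    for s in frequent_sets(EXAMPLE, 2):
        print(sorted(s), support(EXAMPLE, s))
```

With threshold 2, every column subset of `EXAMPLE` except the full set {0, 1, 2} is frequent, since only the second row has a 1 in all three columns. The exhaustive enumeration is used here purely for clarity; the question of efficient listing algorithms is one of the topics the abstract mentions.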

Author information

Authors and Affiliations

Department of EE and Computer Science, University of Illinois at Chicago, 851 S. Morgan Street, Rm 1120, Chicago, IL, 60607-7053, USA
Robert H. Sloan
Department of Mathematics, Statistics and Computer Science, University of Illinois, Chicago, USA
Ken Takata & György Turán
Research Group on Artificial Intelligence, Hungarian Academy of Sciences, Szeged, Hungary
György Turán

About this article

Cite this article

Sloan, R.H., Takata, K. & Turán, G. On frequent sets of Boolean matrices. Annals of Mathematics and Artificial Intelligence 24, 193–209 (1998). https://doi.org/10.1023/A:1018905417023

Issue Date: February 1998
DOI: https://doi.org/10.1023/A:1018905417023
