
On frequent sets of Boolean matrices

Published in Annals of Mathematics and Artificial Intelligence

Abstract

Given a Boolean matrix and a threshold t, a subset of the columns is frequent if there are at least t rows having a 1 entry in each corresponding position. This concept is used in the algorithmic, combinatorial approach to knowledge discovery and data mining. We consider the complexity aspects of frequent sets. An explicit family of subsets is given that requires exponentially many rows to be represented as the family of frequent sets of a matrix, with any threshold. Examples are given of families that can be represented by a small matrix with threshold t, but that require a significantly larger matrix if the threshold is less than t. We also discuss the connections of these problems to circuit complexity and the existence of efficient listing algorithms.
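The definition of a frequent set in the abstract can be illustrated with a minimal brute-force sketch. It is not the authors' method and it says nothing about the complexity results; it only spells out the definition. The names `support`, `frequent_sets`, and `example` are hypothetical, and the matrix is assumed to be a list of equal-length rows with 0/1 entries.

```python
from itertools import combinations

def support(matrix, cols):
    """Count the rows that have a 1 in every column listed in `cols`."""
    return sum(all(row[c] for c in cols) for row in matrix)

def frequent_sets(matrix, t):
    """Return every column subset supported by at least t rows.

    Brute force over all 2^n column subsets, so only usable for tiny n.
    """
    n = len(matrix[0]) if matrix else 0
    return [
        frozenset(cols)
        for k in range(n + 1)
        for cols in combinations(range(n), k)
        if support(matrix, cols) >= t
    ]

# Hypothetical 3 x 3 example matrix (rows are records, columns are attributes).
example = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]

for s in frequent_sets(example, 2):
    print(sorted(s), support(example, s))
```

With threshold 2, the column pair {0, 2} is not frequent in this example because only the middle row has a 1 in both positions. Every subset of a frequent set is itself frequent, since any row supporting the larger set also supports the smaller one.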




Cite this article

Sloan, R.H., Takata, K. & Turán, G. On frequent sets of Boolean matrices. Annals of Mathematics and Artificial Intelligence 24, 193–209 (1998). https://doi.org/10.1023/A:1018905417023


  • DOI: https://doi.org/10.1023/A:1018905417023
