Abstract
In this paper we rule out output-polynomial listing algorithms for the general problem of discovering theories for a conjunction of monotone and anti-monotone constraints, as well as for the particular subproblem in which all constraints are frequency-based. For the general problem we prove a concrete exponential lower bound on the running time that holds for any correct algorithm, even in cases in which the size of the theory as well as the only previous bound are constant. For the case of frequency-based constraints our result holds unless P = NP. These findings motivate further research to identify tractable subproblems and justify approaches with exponential worst-case complexity.
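To make the frequency-based subproblem concrete, the following is a minimal brute-force sketch, not the construction or any algorithm from the paper. It assumes itemsets as the pattern language, a minimum-frequency constraint on one dataset (anti-monotone) and a maximum-frequency constraint on a second dataset (monotone). The names `frequency`, `extract_theory`, `positives`, `negatives`, `min_freq`, and `max_freq` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequency(itemset, dataset):
    """Count the transactions in `dataset` that contain every item of `itemset`."""
    return sum(1 for transaction in dataset if itemset <= transaction)


def extract_theory(items, positives, negatives, min_freq, max_freq):
    """List every itemset satisfying both constraints (illustrative brute force).

    Anti-monotone constraint: frequency in `positives` is at least `min_freq`.
    Monotone constraint: frequency in `negatives` is at most `max_freq`.
    """
    theory = []
    # Enumerate all 2^n candidate itemsets, smallest first.
    for size in range(len(items) + 1):
        for combo in combinations(sorted(items), size):
            candidate = frozenset(combo)
            if (frequency(candidate, positives) >= min_freq
                    and frequency(candidate, negatives) <= max_freq):
                theory.append(candidate)
    return theory


# Toy data, invented for this example only.
positives = [frozenset("abc"), frozenset("ab"), frozenset("ac")]
negatives = [frozenset("a"), frozenset("b"), frozenset("bc")]
theory = extract_theory("abc", positives, negatives, min_freq=2, max_freq=0)
print(theory)
```

This sketch inspects all 2^n candidate itemsets no matter how small the resulting theory is, so its running time is not bounded by a polynomial in the combined input and output size. That is the kind of behaviour an output-polynomial listing algorithm would have to avoid.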
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Boley, M., Gärtner, T. (2009). On the Complexity of Constraint-Based Theory Extraction. In: Gama, J., Costa, V.S., Jorge, A.M., Brazdil, P.B. (eds) Discovery Science. DS 2009. Lecture Notes in Computer Science, vol 5808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04747-3_10
DOI: https://doi.org/10.1007/978-3-642-04747-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04746-6
Online ISBN: 978-3-642-04747-3
eBook Packages: Computer Science, Computer Science (R0)