Computational properties of metaquerying problems

Published: 01 April 2003

Abstract

Metaquerying is a data mining technology by which hidden dependencies among several database relations can be discovered. This tool has already been successfully applied to several real-world applications, but only preliminary results about the complexity of metaquerying can be found in the literature. In this article, we define several variants of metaquerying that encompass, as far as we know, all the variants that have been defined in the literature. We study both the combined complexity and the data complexity of these variants. We show that under the combined complexity measure metaquerying is generally intractable (unless P = NP), lying sometimes quite high in the complexity hierarchies (as high as NP^PP), depending on the characteristics of the plausibility index. Nevertheless, we are able to single out some tractable and interesting metaquerying cases, whose combined complexity is LOGCFL-complete. As for the data complexity of metaquerying, we prove that, in general, it is within TC^0, but lies within AC^0 in some simpler cases. Finally, we discuss the implementation of metaqueries by providing algorithms that answer them.
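
The abstract does not spell out what a metaquery looks like, so the following is only an illustrative sketch under stated assumptions, not the article's algorithm. It assumes a metaquery is a rule template of the form R(X,Z) <- P(X,Y), Q(Y,Z) whose relation variables R, P, Q are instantiated with database relations, and it assumes a confidence-style plausibility index (the fraction of body tuples that also appear in the head). The toy data and the names `join`, `confidence`, and `answer_metaquery` are hypothetical.

```python
# Toy database: relation name -> set of binary tuples (illustrative data only).
DB = {
    "parent": {("ann", "bob"), ("bob", "cat"), ("bob", "dan")},
    "grandparent": {("ann", "cat"), ("ann", "dan")},
    "sibling": {("cat", "dan"), ("dan", "cat")},
}


def join(p, q):
    """Evaluate the body P(X,Y), Q(Y,Z) and project the result onto (X, Z)."""
    return {(x, z) for (x, y1) in p for (y2, z) in q if y1 == y2}


def confidence(head, body):
    """Assumed plausibility index: share of body tuples that are also in the head."""
    return len(body & head) / len(body) if body else 0.0


def answer_metaquery(db, threshold):
    """Brute-force search over all instantiations of R, P, Q in
    R(X,Z) <- P(X,Y), Q(Y,Z), keeping those whose index meets the threshold."""
    results = []
    for r in sorted(db):
        for p in sorted(db):
            for q in sorted(db):
                body = join(db[p], db[q])
                score = confidence(db[r], body)
                if score >= threshold:
                    results.append((r, p, q, score))
    return results


if __name__ == "__main__":
    for r, p, q, score in answer_metaquery(DB, threshold=1.0):
        print(f"{r}(X,Z) <- {p}(X,Y), {q}(Y,Z)  confidence={score:.2f}")
```

On this toy data, the instantiation grandparent(X,Z) <- parent(X,Y), parent(Y,Z) reaches confidence 1.0. The exhaustive triple loop tries every assignment of relation names to the template, which hints at why the number of candidate instantiations matters for the combined complexity discussed in the abstract.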

          • Published in

            ACM Transactions on Computational Logic  Volume 4, Issue 2
            April 2003
            146 pages
            ISSN:1529-3785
            EISSN:1557-945X
            DOI:10.1145/635499

            Copyright © 2003 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Qualifiers

            • article
