Skip to main content

Advertisement

Log in

Answering constraint-based mining queries on itemsets using previous materialized results

  • Published:
Journal of Intelligent Information Systems Aims and scope Submit manuscript

Abstract

In recent years, researchers have begun to study inductive databases, a new generation of databases for leveraging decision support applications. In this context, the user interacts with the DBMS using advanced, constraint-based languages for data mining where constraints have been specifically introduced to increase the relevance of the results and, at the same time, to reduce its volume. In this paper we study the problem of mining frequent itemsets using an inductive database. We propose a technique for query answering which consists in rewriting the query in terms of union and intersection of the result sets of other queries, previously executed and materialized. Unfortunately, the exploitation of past queries is not always applicable. We then present sufficient conditions for the optimization to apply and show that these conditions are strictly connected with the presence of functional dependencies between the attributes involved in the queries. We show some experiments on an initial prototype of an optimizer which demonstrates that this approach to query answering is viable and in many practical cases it drastically reduces the query execution time.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Abiteboul, S., & Duschka, O. M. (1998). Complexity of answering queries using materialized views. In Proceedings of the seventeenth ACM PODS conference, June 1–3, 1998, Seattle, Washington, (pp.) 254–263, ACM Press.

  • Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., & Verkamo, A. I. (1995). Fast Discovery of Association Rules. In U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy (Eds.), Knowledge discovery in databases, Vol. 2. AAAI/MIT Press.

  • Chaudhuri, S., & Dayal, U. (1997). An overview of data warehousing and olap technology. Sigmod Record, 26 (1), 65–74.

    Article  Google Scholar 

  • Chaudhuri, S., Krishnamurthy, R., Potamianos, S., & Shim, K. (1995). Optimizing queries with materialized views. P. S. Yu and A. L. P. Chen (Eds.), In Proceedings of the eleventh international conference on data engineering, March 6–10, 1995, Taipei, Taiwan, (pp. 190–200), IEEE Computer Society.

  • Chaudhuri, S., Narasayya, V., & Sarawagi, S. (2002). Efficient evaluation of queries with mining predicates. In Proceedings of the 18th int'l conf. on data engineering, USA.

  • Fang, M., Shivakumar, N., Garcia-Molina, H., Motwani, R., & Ullman, J. (1998). Computing iceberg queries efficiently. In Proceeding of VLDB '98.

  • Gallo, A., Esposito, R., Meo, R., & Botta, M. (2005). Optimization of association rules extraction through exploitation of context dependent constraints. In Proceedings of the ninth AI*IA conf., lecture notes in artificial intelligence (to appear).

  • Goldberg, D. E. (1989). Genetic algorithms, in search, optimization and machine learning. Reading, MA: Addison-Wesley.

    Google Scholar 

  • Günzel, H., Albrecht, J., & Lehner, W. (2000). Use and reuse of association rules in an OLAP environment. In Proceedings of the 2000 inf. resources management association int. conf. on challenges of information technology management in the 21st century, (pp. 566–569). Idea Group Publishing.

  • Gupta, A., Harinarayan, V., & Quass, D. (1995). Aggregate query processing in data warehousing environments. In Proceedings of VLDB '95, (pp. 358–369).

  • Hamming, R. W. (1950). E rror-D etecting and E rror-C orrecting C odes. Bell System Technical Journal, 29, 147–160.

    Google Scholar 

  • Han, J., Fu, Y., Wang, W., Koperski, K., & Zaiane, O. (1996). DMQL : A data mining query language for relational databases. In Proceedings of SIGMOD-96 workshop on research issues on data mining and knowledge discovery.

  • Imielinski, T., & Mannila, H. (1996). A database perspective on knowledge discovery. Communications of the ACM, 39(11), 58–64.

    Article  Google Scholar 

  • Imielinski, T., Virmani, A., & Abdoulghani, A. (1996). DataMine: application programming interface and query language for database mining. KDD-96, (pp. 256–260).

  • Lenz, H.-J., & Shoshani, A. (1997). Summarizability in OLAP and statistical data bases. In Proceedings ninth international conference on scientific and statistical database management, (pp. 132–143). IEEE Computer Society.

  • Leung, C. K.-S., Lakshmanan, L. V. S., & Ng, R. T. (2002). Exploiting succinct constraints using fp-trees. ACM SIGKDD Explorations, 4(1), 40–49.

    Google Scholar 

  • Levy, A.Y., Mendelzon, A.O., Sagiv, Y., & Srivastava, D. (1995). Answering queries using views. In Proceedings of the 14th ACM SIGACT -SIGMOD -SIGART symposium on principles of database systems, San Jose, Calif, (pp. 95–104).

  • Malvestuto, F. M. (1988). The derivation problem for summary data. In Proceedings of the ACM SIGMOD international conference on management of data, (pp. 82–89).

  • Meo, R. (2003). Optimization of a language for data mining. In Proceedings of the 2003 ACM symposium on applied computing. Melbourne, Florida.

  • Meo, R., Psaila, G., & Ceri, S. (1996). A new SQL -like operator for mining association rules. In Proceedings of the 22st VLDB conference. Bombay, India.

  • Ng, R. T., Lakshmanan, L. V. S., Han, J., & Pang, A. (1998). Exploratory mining and pruning optimizations of constrained associations rules. In Proceedings of 1998 ACM SIGMOD international conference management of data, (pp. 13–24).

  • Nutt, W., Sagiv, Y., & Shurin, S. (1998). Deciding equivalence among aggregate queries. In Proceedings of ACM PODS, (pp. 214–223).

  • Padmanabhan, B., & Tuzhilin, A. (1999). Unexpectedness as a measure of interestingness in knowledge discovery. Decis. Support Syst, 27 (3), 303–318.

    Google Scholar 

  • Padmanabhan, B., & Tuzhilin, A. (2002). Knowledge refinement based on the discovery of unexpected patterns in data mining. Decis. Support Syst, 33 (3), 309–321.

    Google Scholar 

  • Park, C.-S., Kim, M. H., & Lee, Y.-J. (2001). Rewriting OLAP queries using materialized views and dimension hierarchies in data warehouses. In Proceeding of ICDE'01.

  • Perng, C.-S., Wang, H., Ma, S., & Hellerstein, J. L. (2002). Discovery in multi-attribute data with user-defined constraints. ACM SIGKDD Explorations, 4(1), 56–64.

    Google Scholar 

  • Pottinger, R., & Levy, A.Y. (2000). A scalable algorithm for answering queries using views. In The VLDB Journal, (pp. 484–495).

  • Quass, D., & Widom, J. (1997). On-line warehouse view maintenance. In Proceedings of ACM SIGMOD international conference on management of data, (pp. 393–404).

  • Raedt, L. D. (2002). A perspective on inductive databases. ACM SIGKDD Explorations, 4(2), 69–77.

    Google Scholar 

  • Raedt, L. D., Jaeger, M., Lee, S. D., & Mannila, H. (2002). A theory of inductive query answering. In Proceedings of IEEE int'l conf. on data mining, (pp. 123–130).

  • Sebag, M., & Rouveirol, C. (2000). Resource-bounded relational reasoning: induction and deduction through stochastic matching. Machine Learning, 38(1/2), 41–62.

    Article  Google Scholar 

  • Srikant, R., Vu, Q., & Agrawal, R. (1997). Mining association rules with item constraints. In Proceedings of 1997 ACM KDD, (pp. 67–73).

  • Tsur, D., Ullman, J. D., Abiteboul, S., Clifton, C., Motwani, R., Nestorov, S., & Rosenthal, A. (1998). Query flocks: a generalization of association-rule mining. In Proceedings of 1998 ACM SIGMOD international conference management of data.

  • Wang, H., & Zaniolo, C. (1998). User defined aggregates for logical data languages. In Proceedings of DDLP (pp. 85–97).

  • Wang, X.S., & Li, C. (1999). Deriving orthogonality to optimize the search for summary data. Information Systems 24(1), 47–65.

    Article  MathSciNet  Google Scholar 

  • Widom, J. (1995). Research problems in data warehousing. In Proceedings of fourth int. conf. on information and knowledge management, (pp. 25–30).

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Roberto Esposito.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Esposito, R., Meo, R. & Botta, M. Answering constraint-based mining queries on itemsets using previous materialized results. J Intell Inf Syst 26, 95–111 (2006). https://doi.org/10.1007/s10844-006-5453-z

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10844-006-5453-z

Keywords

Navigation