Query optimization via contention space partitioning and cost error controlling for dynamic multidatabase systems

Distributed and Parallel Databases

Abstract

A multidatabase system (MDBS) integrates information from multiple autonomous local databases. Performing global query optimization to achieve efficient query processing in such a system is challenging due to the local autonomy of the data sources. Dynamic factors in the environment make the problem even more difficult. In this paper, we present two techniques, namely contention space partitioning and cost error controlling, to perform global query optimization in a dynamic MDBS. Both techniques generate an execution plan with multiple versions for a query in a dynamic MDBS, utilizing the multistate cost models built for the dynamic environment via our previous multistate query sampling method. The first technique partitions the contention space of a dynamic multidatabase environment into a given number of subspaces and chooses a good query execution plan version for each subspace, while the second technique selects a set of execution plan versions by using a given error tolerance to control query execution costs. Experiments demonstrate that the proposed techniques are promising for performing global query optimization in a dynamic MDBS. Compared with related work on dynamic query optimization, our approach has the advantage of avoiding the high overhead of modifying or re-generating an execution plan for a query based on dynamic runtime information.
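The abstract names the two techniques but does not spell out their algorithms, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a one-dimensional contention level in [0, 1], three hypothetical plans with made-up linear cost functions, equal-width subspaces, and a simple greedy selection for the error-controlled case. All names (`PLANS`, `partition_versions`, `error_controlled_versions`) are illustrative.

```python
from typing import Callable, Dict, List, Tuple

# ASSUMPTION: each candidate plan has an estimated cost that depends only on a
# scalar contention level c in [0, 1]. The functions below are invented.
CostModel = Callable[[float], float]

PLANS: Dict[str, CostModel] = {
    "plan_a": lambda c: 10.0 + 40.0 * c,  # cheapest when idle, degrades quickly
    "plan_b": lambda c: 20.0 + 10.0 * c,  # moderate across the whole range
    "plan_c": lambda c: 28.0 + 1.0 * c,   # nearly flat, best under heavy load
}


def partition_versions(
    plans: Dict[str, CostModel], k: int, samples: int = 5
) -> List[Tuple[float, float, str]]:
    """Split [0, 1] into k equal-width subspaces and, for each one, keep the
    plan with the lowest total estimated cost over a few sample points."""
    versions = []
    for i in range(k):
        lo, hi = i / k, (i + 1) / k
        points = [lo + (hi - lo) * (j + 0.5) / samples for j in range(samples)]
        best = min(plans, key=lambda name: sum(plans[name](p) for p in points))
        versions.append((lo, hi, best))
    return versions


def error_controlled_versions(
    plans: Dict[str, CostModel], tolerance: float, grid: int = 101
) -> List[str]:
    """Greedily pick plan versions until, at every grid point, the cheapest
    selected plan costs at most (1 + tolerance) times the best possible cost."""
    points = [i / (grid - 1) for i in range(grid)]
    optimum = [min(f(p) for f in plans.values()) for p in points]

    def within(name: str, idx: int) -> bool:
        return plans[name](points[idx]) <= (1 + tolerance) * optimum[idx]

    selected: List[str] = []
    uncovered = list(range(grid))
    while uncovered:
        # Pick the unselected plan that covers the most uncovered points.
        candidates = [n for n in plans if n not in selected]
        best = max(candidates, key=lambda n: sum(within(n, i) for i in uncovered))
        selected.append(best)
        uncovered = [i for i in uncovered if not within(best, i)]
    return selected


if __name__ == "__main__":
    for lo, hi, name in partition_versions(PLANS, k=3):
        print(f"contention [{lo:.2f}, {hi:.2f}) -> {name}")
    print("versions kept at 10% tolerance:", error_controlled_versions(PLANS, 0.10))
```

In this toy setting the first function fixes the number of versions in advance (one per subspace), while the second lets the tolerance decide how many versions are kept: a looser tolerance yields fewer versions at the price of a larger possible cost error. In both cases the version to run would be chosen at execution time from the observed contention level, with no re-optimization.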

Author information

Corresponding author

Correspondence to Qiang Zhu.

Additional information

Communicated by Ahmed K. Elmagarmid.

Research was supported by the US National Science Foundation under Grant # IIS-9811980 and The University of Michigan.

About this article

Cite this article

Zhu, Q., Haridas, J. & Hou, WC. Query optimization via contention space partitioning and cost error controlling for dynamic multidatabase systems. Distrib Parallel Databases 23, 151–188 (2008). https://doi.org/10.1007/s10619-008-7025-4

  • DOI: https://doi.org/10.1007/s10619-008-7025-4