Adaptive workload allocation in query processing in autonomous heterogeneous environments

Gounaris, Anastasios; Smith, Jim; Paton, Norman W.; Sakellariou, Rizos; Fernandes, Alvaro A. A.; Watson, Paul

doi:10.1007/s10619-008-7032-5

Published: 28 October 2008

Distributed and Parallel Databases, Volume 25, pages 125–164 (2009)

Abstract

The increasing prevalence of networked storage and computational resources, along with middleware for managing resource access and sharing, raises the prospect that queries can be run over resources obtained on demand, rather than on dedicated infrastructures. However, the movement of query processing into non-dedicated environments means that it is necessary to take account of the partial information and unstable conditions that characterise autonomous, shared, distributed settings. Thus, query processing on grid platforms needs to be adaptive, revising evaluation strategies at query runtime in response to the evolving environment, such as changes to machine load and availability. To address this challenge, adaptive techniques are described that: (i) balance load across plan partitions supporting intra-operator parallelism; (ii) remove bottlenecks in pipelined plans supporting inter-operator parallelism; and (iii) combine the two aforementioned techniques. The approach has been empirically evaluated in a grid-enabled adaptive query processor.

Author information

Authors and Affiliations

Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, 541 24, Greece
Anastasios Gounaris
School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
Norman W. Paton, Rizos Sakellariou & Alvaro A. A. Fernandes
School of Computing Science, University of Newcastle upon Tyne, Newcastle upon Tyne, NE1 7RU, UK
Jim Smith & Paul Watson

Corresponding author

Correspondence to Anastasios Gounaris.

Additional information

Communicated by Ahmed K. Elmagarmid.

About this article

Cite this article

Gounaris, A., Smith, J., Paton, N.W. et al. Adaptive workload allocation in query processing in autonomous heterogeneous environments. Distrib Parallel Databases 25, 125–164 (2009). https://doi.org/10.1007/s10619-008-7032-5

Received: 23 December 2006
Accepted: 13 October 2008
Published: 28 October 2008
Issue Date: June 2009
DOI: https://doi.org/10.1007/s10619-008-7032-5
