Abstract
Fault tolerance has long been a feature of database systems: transactions allow applications to be structured so that updating applications can continue in spite of machine failures. For read-only queries, the received wisdom has been that support for fault tolerance is too expensive to be worthwhile. Distributed query processing (DQP) is coming to be seen as a promising way of implementing applications that combine structured data and analysis operations in dynamic distributed settings such as computational grids. Accordingly, a number of protocols have been described that tolerate the failure of intermediate machines, permitting a query to continue from surviving intermediate state. However, a distributed query can have a non-trivial mapping onto hardware resources, so it is often possible to choose between several recovery strategies when a failure occurs. The work described here makes an initial investigation of this choice in the context of an example query expressed over distributed resources in a Grid, and shows that it can be worthwhile to choose between recovery alternatives dynamically, at the point a failure is detected, rather than statically beforehand.
Copyright information
© 2007 Springer Science+Business Media, LLC
About this paper
Cite this paper
Smith, J., Watson, P. (2007). Failure Recovery Alternatives in Grid-Based Distributed Query Processing: A Case Study. In: Talia, D., Bilas, A., Dikaiakos, M.D. (eds) Knowledge and Data Management in GRIDs. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-37831-2_4
DOI: https://doi.org/10.1007/978-0-387-37831-2_4
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-37830-5
Online ISBN: 978-0-387-37831-2
eBook Packages: Computer Science, Computer Science (R0)