Failure Recovery Alternatives in Grid-Based Distributed Query Processing: A Case Study

  • Conference paper
Knowledge and Data Management in GRIDs

Abstract

Fault-tolerance has long been a feature of database systems, with transactions supporting the structuring of applications so as to ensure continuation of updating applications in spite of machine failures. For read-only queries, the received wisdom has been that support for fault-tolerance is too expensive to be worthwhile. Distributed query processing (DQP) is coming to be seen as a promising way of implementing applications that combine structured data and analysis operations in dynamic distributed settings such as computational grids. Accordingly, a number of protocols have been described that support tolerance to failure of intermediate machines, so as to permit continuation from surviving intermediate state. However, a distributed query can have a non-trivial mapping onto hardware resources. Because of this, it is often possible to choose among several recovery strategies in the event of a failure. The work described here makes an initial investigation of this area, using an example query expressed over distributed resources in a Grid, and shows that it can be worthwhile to choose between recovery alternatives dynamically, at the point a failure is detected, rather than statically beforehand.

Copyright information

© 2007 Springer Science+Business Media, LLC

About this paper

Cite this paper

Smith, J., Watson, P. (2007). Failure Recovery Alternatives in Grid-Based Distributed Query Processing: A Case Study. In: Talia, D., Bilas, A., Dikaiakos, M.D. (eds) Knowledge and Data Management in GRIDs. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-37831-2_4

  • DOI: https://doi.org/10.1007/978-0-387-37831-2_4

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-37830-5

  • Online ISBN: 978-0-387-37831-2

  • eBook Packages: Computer Science (R0)
