Online Recovery in Parallel Database Systems

Jimenez-Peris, Ricardo

doi:10.1007/978-0-387-39940-9_1089


Synonyms

High availability; Continuous availability; 24x7 operation

Definition

Replication (also known as clustering) is a technique to provide high availability in parallel and distributed databases. High availability aims to provide continuous service operation and has two aspects. On the one hand, it provides fault tolerance through redundancy in the form of replication, that is, keeping multiple copies (replicas) of the data at different sites. On the other hand, because the sites holding the replicas may crash or otherwise fail, failed or new replicas must be reintroduced into the system to maintain a given degree of availability. Introducing a new replica requires transferring the current state to it in a consistent fashion, a process known as recovery. A simple solution to this problem is offline recovery: to obtain a quiescent state, request processing is suspended, and the state is then transferred from a working replica (termed the recoverer replica) to the new...
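The offline recovery idea described above can be illustrated with a minimal sketch. It is not the entry's protocol: the names Replica, Cluster, suspend, transfer_state, and resume are hypothetical, the state is modeled as an in-memory dictionary, and the final step (resuming and applying the writes that were queued in the meantime) is an assumption, since the preview text is cut off before it describes what follows the transfer.

```python
import copy


class Replica:
    """Toy replica that holds a full copy of the data as a dict."""

    def __init__(self, name):
        self.name = name
        self.state = {}

    def apply(self, key, value):
        self.state[key] = value


class Cluster:
    """Hypothetical replicated system: every write goes to every replica."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.suspended = False
        self.pending = []  # writes that arrive while processing is suspended

    def write(self, key, value):
        if self.suspended:
            # Offline recovery: nothing is processed during the transfer.
            self.pending.append((key, value))
            return
        for replica in self.replicas:
            replica.apply(key, value)

    def suspend(self):
        # Stop request processing so the state becomes quiescent.
        self.suspended = True

    def transfer_state(self, new_replica):
        # Copy the state from a working (recoverer) replica to the new one.
        recoverer = self.replicas[0]
        new_replica.state = copy.deepcopy(recoverer.state)
        self.replicas.append(new_replica)

    def resume(self):
        # Assumed step: resume processing and drain the queued writes.
        self.suspended = False
        queued, self.pending = self.pending, []
        for key, value in queued:
            self.write(key, value)
```

In this sketch, a write issued between suspend() and resume() is only queued, which makes the cost of the offline approach visible: the service is unavailable for the whole duration of the state transfer.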


Recommended Reading

Bernstein P.A., Hadzilacos V., and Goodman N. Concurrency Control and Recovery in Database Systems. Addison Wesley, 1987.
Castro M. and Liskov B. Practical byzantine fault tolerance and proactive recovery. ACM Trans. Comput. Syst., 20(4):398–461, 2002.
Gançarski S., Naacke H., Pacitti E., and Valduriez P. The Leganet system: Freshness-aware transaction routing in a database cluster. Inform. Syst., 32(2):320–343, 2007.
Gashi I., Popov P., and Strigini L. Fault Tolerance via Diversity for Off-The-Shelf Products: a Study with SQL Database Servers. IEEE Trans. Depend. Secur. Comput., 4(4):280–294, 2007.
Jiménez-Peris R., Patiño-Martínez M., and Alonso G. Non-Intrusive, Parallel Recovery of Replicated Data. In Proc. 21st Symp. on Reliable Distributed Syst., 2002, pp. 150–159.
Kemme B. and Alonso G. Don’t be lazy, be consistent: Postgres-R, a new way to implement database replication. In Proc. 26th Int. Conf. on Very Large Data Bases, 2000, pp. 134–143.
Kemme B. and Alonso G. A New Approach to Developing and Implementing Eager Database Replication Protocols. ACM Trans. Database Syst., 25(3):333–379, 2000.
Kemme B., Bartoli A., and Babaoglu O. Online Reconfiguration in Replicated Databases Based on Group Communication. In Proc. Int. Conf. on Dependable Systems and Networks, 2001, pp. 117–130.
Lau E. and Madden S. An Integrated Approach to Recovery and High Availability in an Updatable, Distributed Data Warehouse. In Proc. 32nd Int. Conf. on Very Large Data Bases, 2006, pp. 703–714.
Manassiev K. and Amza C. Scaling and Continuous Availability in Database Server Clusters through Multiversion Replication. In Proc. Int. Conf. on Dependable Systems and Networks, 2007, pp. 666–676.
Özsu M.T. and Valduriez P. Principles of Distributed Database Systems. Prentice-Hall, 2nd ed., 1999.
Pacitti E. and Simon E. Update Propagation Strategies to Improve Freshness in Lazy Master Replicated Databases. VLDB J., 8(3):305–318, 2000.
Patiño-Martínez M., Jiménez-Peris R., Kemme B., and Alonso G. Middle-R: Consistent Database Replication at the Middleware Level. ACM Trans. Comput. Syst., 23(4):375–423, 2005.
Pedone F., Guerraoui R., and Schiper A. The Database State Machine Approach. Distributed and Parallel Databases, 14(1):71–98, 2003.
Plattner C. and Alonso G. Ganymed: Scalable Replication for Transactional Web Applications. In Proc. ACM/IFIP/USENIX Int. Middleware Conf., 2004, pp. 155–174.
PostgreSQL. PostgreSQL Point in Time Recovery. http://www.postgresql.org/docs/8.0/interactive/backup-online.html.
Vandiver B., Balakrishnan H., Liskov B., and Madden S. Tolerating Byzantine Faults in Database Systems using Commit Barrier Scheduling. In Proc. 21st ACM Symp. on Operating System Principles, 2007, pp. 59–72.


Editor information

Editors and Affiliations

College of Computing, Georgia Institute of Technology, 266 Ferst Drive, 30332-0765, Atlanta, GA, USA
LING LIU (Professor)
Database Research Group David R. Cheriton School of Computer Science, University of Waterloo, 200 University Avenue West, N2L 3G1, Waterloo, ON, Canada
M. TAMER ÖZSU (Professor and Director, University Research Chair)

About this entry

Cite this entry

Jimenez-Peris, R. (2009). Online Recovery in Parallel Database Systems. In: LIU, L., ÖZSU, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_1089

DOI: https://doi.org/10.1007/978-0-387-39940-9_1089
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-35544-3
Online ISBN: 978-0-387-39940-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
