Abstract
The most important non-functional requirements for an SQL server are performance and dependability. This paper argues, based on empirical results from our ongoing research with diverse SQL servers, in favour of diverse redundancy as a way of improving both. We present evidence that current data replication solutions are insufficient to protect against the range of faults documented for database servers; outline possible fault-tolerant architectures using diverse servers; discuss the design problems involved; and offer evidence of the potential for performance improvement through diverse redundancy.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gashi, I., Popov, P., Stankovic, V., Strigini, L. (2004). On Designing Dependable Services with Diverse Off-the-Shelf SQL Servers. In: de Lemos, R., Gacek, C., Romanovsky, A. (eds) Architecting Dependable Systems II. Lecture Notes in Computer Science, vol 3069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25939-8_9
DOI: https://doi.org/10.1007/978-3-540-25939-8_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23168-4
Online ISBN: 978-3-540-25939-8
eBook Packages: Springer Book Archive