Using semantic correctness in multidatabases to achieve local autonomy, distribute coordination, and maintain global integrity
Introduction
The multidatabase area has received considerable attention [3], [4], [6], [7], [10], [11], [13], [18], [19], [24]. The goal of a multidatabase is to integrate a heterogeneous collection of existing databases, thereby enabling global transactions to span the multiple local databases. Four typically contradictory goals drive the integration: respect for local autonomy, distribution of global transaction management, maintenance of local and global integrity constraints, and correctness of execution histories. No existing multidatabase proposal satisfies all of these goals, but the multidatabase model proposed in this paper does, for applications that satisfy a certain set of properties. Our notion of semantic correctness allows us to achieve local autonomy of the local databases, distribute coordination of global transactions, and maintain local and global integrity constraints. Before outlining our approach, we elaborate on the four multidatabase goals.
Local autonomy. The local databases typically exist long before they are organized as a multidatabase. Furthermore, the local databases may be owned by separate, and possibly competing, organizations. Thus each local database requires both design autonomy, to accommodate its diverse legacy nature, and execution autonomy, to minimize the interference of global transactions with local processing [10], [20].
Standard protocols often violate local autonomy. For example, the two-phase commit protocol, which is often used to ensure the atomicity of global transactions, requires each local database to support a prepared state, which clearly violates design autonomy. Since local databases can no longer abort or commit global subtransactions at their discretion, the two-phase commit protocol also violates execution autonomy.
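The autonomy violation can be seen in a minimal sketch of two-phase commit (the `Participant` class and its method names are invented for illustration, not taken from any particular system). Once a participant answers the prepare request affirmatively, it must hold a visible prepared state and may no longer unilaterally abort, which is precisely the loss of design and execution autonomy described above.

```python
# Minimal two-phase commit sketch. The point: between the two phases
# each local database sits in a "prepared" state and has surrendered
# its right to abort or commit at its own discretion.

class Participant:
    def __init__(self, name):
        self.name = name
        self.state = "active"

    def prepare(self):
        # The participant promises to commit if told to; it can no
        # longer unilaterally abort -- execution autonomy is lost here.
        self.state = "prepared"
        return True

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: ask every participant to enter the prepared state.
    if all(p.prepare() for p in participants):
        # Phase 2: all prepared, so the global decision is commit.
        for p in participants:
            p.commit()
        return "committed"
    # Any refusal forces a global abort.
    for p in participants:
        p.abort()
    return "aborted"


dbs = [Participant("db1"), Participant("db2")]
print(two_phase_commit(dbs))  # committed
```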
Local autonomy forces the local subtransactions of global transactions to be treated as ordinary local transactions. For example, the local database is free to choose any concurrency control algorithm, and global transactions must accommodate that choice. Mechanisms for managing global transactions must be built on top of – as opposed to inside – the local database.
Distributed coordination. The local databases in a multidatabase may fail, or be disconnected from the network, or simply choose to limit their degree of global cooperation at any time. To provide continued service to subsets of a multidatabase, global transaction processing must tolerate the failure of any single site. For this reason, global transaction processing should be distributed instead of centralized. As we elaborate in the related work section, most multidatabase proposals do not satisfy this criterion.
Global integrity. Local databases are charged with maintaining local integrity constraints. Global integrity constraints are more difficult to maintain, particularly in the absence of a centralized global transaction manager. Some researchers have argued against having global integrity constraints in multidatabases [13], but we take the position in this paper that global constraints are a natural part of some applications, and that means for implementing such constraints should be provided.
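The distinction between local and global constraints can be made concrete with a small sketch (the database contents and the conservation invariant are invented for illustration). A local constraint mentions only one database's objects; a global constraint is a predicate over objects held in different local databases, and a partially executed global transaction can temporarily violate it:

```python
# Hedged illustration: two local databases and one global integrity
# constraint ("total funds across both databases is conserved").

local_db_a = {"alice": 300}   # objects of local database A
local_db_b = {"bob": 200}     # objects of local database B

TOTAL = 500  # invariant: the sum of all balances is conserved

def local_constraint_a(db_a):
    # A local constraint is defined over one database's objects only.
    return all(v >= 0 for v in db_a.values())

def global_constraint(db_a, db_b):
    # A global constraint spans objects of both local databases.
    return sum(db_a.values()) + sum(db_b.values()) == TOTAL

# A transfer executed as two subtransactions: after only the first
# subtransaction the global constraint is temporarily violated.
local_db_a["alice"] -= 100
assert not global_constraint(local_db_a, local_db_b)
local_db_b["bob"] += 100
assert global_constraint(local_db_a, local_db_b)
```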
Correctness. Syntactic correctness criteria use some notion of serializability. Serializability obligates the mechanism processing global transactions to make consistent ordering decisions at each local database, which is problematic for multidatabases. Most works using the standard serializability notion [5] for multidatabases [7], [18] require a centralized global transaction manager with a relatively high overhead. A decentralized approach that guarantees serializable execution of multidatabase transactions, but requires the local databases to support a visible prepared-to-commit state, was proposed by Batra et al. [6]. Weaker serializability notions for multidatabases [10], [13], [19] impose severe constraints on which applications can be accommodated. Some multidatabase models [20] require a stronger notion of correctness than serializability due to the possibility that a global transaction will generate output via some local subtransaction and then semantically abort by executing a compensating step. See Section 2 for a detailed review. Semantic notions of correctness have also been applied to multidatabases [24], but without the simultaneous achievement of local autonomy, distributed coordination, and global integrity.
In this paper, we do not use serializability as the correctness criterion, but introduce a semantic correctness model tailored to multidatabases. The result is a property-oriented approach to analyzing applications. The properties are: (1) semantic atomicity: a global transaction commits or compensates for all of its subtransactions, (2) consistent execution: global integrity constraints are restored upon completion of partially executed global transactions, and (3) sensitive transaction isolation: outputs always appear to have been generated from a consistent state. Together, these three properties encompass two of the goals, namely, maintenance of global integrity constraints and correctness. Note that these three new properties replace their syntactic counterparts, namely atomicity, consistency and isolation, which are used in the standard transaction processing model.
We introduce a successor set mechanism for managing global transactions. A fourth property, the (4) valid successor set property, ensures that a successor set description is a valid refinement of the global transactions. We describe the integration of successor set descriptions for global transactions with local transaction processing. The successor set mechanism yields the remaining two goals, namely, local autonomy and distributed coordination of global transactions. Thus, for applications that satisfy the necessary four properties, our approach provides for all four of the desired goals for multidatabases.
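One way to picture a successor set description (this encoding is purely hypothetical and does not reproduce the paper's own notation) is as a map from each prefix of executed subtransactions to the set of steps that may legally come next, including compensating steps. Because each site only needs the successor set of the current prefix, coordination can proceed without a centralized global transaction manager:

```python
# Hypothetical successor set description for a travel-booking global
# transaction: after booking a flight, the legal successors are to
# book a hotel or to compensate by cancelling the flight.

successor_sets = {
    ():                                     {"reserve_flight"},
    ("reserve_flight",):                    {"reserve_hotel", "cancel_flight"},
    ("reserve_flight", "reserve_hotel"):    {"finish"},
    ("reserve_flight", "cancel_flight"):    {"finish"},
}

def valid_next(executed, step):
    # A site decides locally whether `step` may run, consulting only
    # the successor set of the prefix executed so far.
    return step in successor_sets.get(tuple(executed), set())

print(valid_next(["reserve_flight"], "cancel_flight"))  # True
print(valid_next(["reserve_flight"], "finish"))         # False
```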
There are two possible outcomes to analyzing an application. One outcome is that the application does indeed enjoy the four properties. The other is that the application fails to satisfy one or more of the four properties, in which case the application developer must either revise the application or select an alternate approach. In this paper we show how to gain assurance that the properties indeed hold for a given application; if the application does not possess the necessary properties, we do not suggest an approach for modifying it to ensure their satisfaction. Appendix A illustrates the formalization of an example in the specification language Object Z. The Object Z language [9] was chosen because it allows predicates on histories to be specified as temporal logic formulae. The appendices contain arguments that the four properties hold for the example.
An important issue is how well our model scales up to real-world applications. The necessary properties must be demonstrated for any application that is to be implemented with our model. For the purposes of this paper we use the Object Z specification language and analyze the specifications by hand. However, for real-world applications this may not be feasible; for such applications it is necessary to automate, to the extent possible, the process of discharging the proof obligations. Therefore, we also examine model checking as an automated approach to verifying the properties. Model checking requires that the system to be verified be represented as a finite state machine. Since software systems are in general infinite state machines, finite state abstractions of the software systems must be developed, which can then be analyzed by the model checker [26]. We show how the running example can be abstracted into a finite state machine and the properties verified in the abstraction by the SMV model checker [17]. Verification in the abstraction yields informal confidence that the properties hold in the original.
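The essence of what a model checker such as SMV does can be sketched in a few lines: exhaustively explore a finite-state abstraction and check a property over all reachable states. The states and transitions below are invented for illustration (they are not the paper's running example), and the property checked is a weak reachability form of "the database eventually returns to a consistent state":

```python
# Toy exhaustive state-space exploration, in the spirit of what SMV
# does over a finite-state abstraction. All state names are invented.

from collections import deque

# Abstract states of a global transaction and their transitions.
TRANSITIONS = {
    "initial":          {"flight_booked"},
    "flight_booked":    {"hotel_booked", "flight_cancelled"},
    "hotel_booked":     {"done"},
    "flight_cancelled": {"done"},
    "done":             set(),
}

CONSISTENT = {"initial", "done"}  # states satisfying the global constraints

def check_eventually_consistent(start):
    # Breadth-first exploration of every reachable state, then a check
    # that every terminal (dead-end) state is consistent.
    seen, queue = set(), deque([start])
    while queue:
        s = queue.popleft()
        if s in seen:
            continue
        seen.add(s)
        queue.extend(TRANSITIONS[s])
    return all(s in CONSISTENT for s in seen if not TRANSITIONS[s])

print(check_eventually_consistent("initial"))  # True
```

A real model checker verifies temporal logic formulae over all paths rather than just terminal states, but the exhaustiveness over a finite abstraction is the same idea.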
The rest of the paper is organized as follows. Section 2 discusses related work. Sections 3 and 4 present our model. Section 5 outlines an efficient mechanism for ensuring semantic correctness. Section 6 discusses the extensions required to support global transactions. Section 7 describes how a model checker can be used to verify semantic correctness for an example application. Finally, Section 8 concludes the paper.
Related work
Several schemes [7], [18] have been proposed to ensure global serializability in a multidatabase environment. Unfortunately, these algorithms require the existence of a centralized global transaction manager and have high run time overhead because each requires a site graph to be maintained.
A decentralized mechanism to ensure global serializability has been proposed by Batra et al. [6]. Each site consists of a global transaction manager (GTM) and a set of servers. The site at which the global
The model
A multidatabase consists of a collection of local databases. At any given time, the state of a local database is determined by the values of the objects in the database. A change in the value of a database object changes the state. The global state comprises the states of all the individual local databases. Integrity constraints are predicates defined over the objects. Integrity constraints may be local or global; local constraints are defined over objects belonging to one local database and
Applications with global integrity constraints
In applications without global integrity constraints, the semantic atomicity property is enough to ensure semantic correctness. However, some multidatabase applications have global integrity constraints, in which case the execution of a subtransaction may no longer preserve the global integrity constraints. In such cases we need assurance that users are not displayed inconsistent data and eventually the database returns to a consistent state. This motivates us to propose the sensitive
Developing an efficient mechanism
The previous section outlines how global transactions can be decomposed, and how the interleaving of the decomposed transactions must be controlled to avoid inconsistencies. Although the decomposed specifications are helpful for analysis, it is undesirable to directly implement the decomposed specifications. First, it is expensive to implement the auxiliary variables. Second, the history invariants must be checked before an operation can be dispatched. The checking of history invariants is an
Support for global transactions
A correct successor set history is a semantic history which in turn is a subtransactionwise serial history. For every pair of operations in a subtransactionwise serial history, all operations of one subtransaction must appear before any operation of the other subtransaction. However, if the subtransactions of a transaction execute atomically and without any interleaving, the database makes poor use of system resources. The standard solution, which we adopt, is to increase the class of allowable
Automated verification of the decomposed specifications
To use our approach on real-world applications, automated verification of the properties given in Sections 3 and 4 is desirable. Object Z does not have the tool support necessary to discharge the required proof obligations automatically. Even if Object Z did have state-of-the-art tools, theorem proving is quite difficult and far from 'automatic'. The limited success of theorem proving, coupled with the dramatic increase in the capabilities of
Conclusion
Our contribution in this paper is to define a model for multidatabases in which the four goals of local autonomy, distributed management of global transactions, maintenance of integrity constraints, and correctness of execution histories are simultaneously achieved. We argue that no other multidatabase proposal meets all of these goals simultaneously as well as our proposal. We begin with a semantic notion of correctness composed of three properties: semantic atomicity, consistent execution,
References (26)
- et al., A case study in model checking software systems, Science of Computer Programming, 1997.
- et al., State-based model checking of event driven systems requirements, IEEE Transactions on Software Engineering, 1993.
- et al., Applying formal methods to semantic-based decomposition of transactions, ACM Transactions on Database Systems, 1997.
- A. Silberschatz et al., On rigorous transaction scheduling, IEEE Transactions on Software Engineering, 1991.
- et al., Concurrency Control and Recovery in Database Systems, 1987.
- R.K. Batra, M. Rusinkiewicz, D. Georgakopoulos, A decentralized deadlock-free concurrency control method for...
- Y. Breitbart, A. Silberschatz, Multidatabase update issues, in: Proceedings of ACM-SIGMOD International Conference on...
- et al., Model checking large software specifications, IEEE Transactions on Software Engineering, 1998.
- D. Duke, R. Duke, Towards a semantics for Object Z, in: D. Bjorner, C.A.R. Hoare, H. Langmaack (Eds.), VDM'90: VDM and...
- Using semantic knowledge for transaction processing in a distributed database, ACM Transactions on Database Systems.
1. CIS Department, University of Michigan-Dearborn, 4901 Evergreen Road, Dearborn, MI 48128, USA.
2. Partially supported by the National Science Foundation under grant IRI-9633541 and by the National Security Agency under grants MDA904-96-1-0103 and MDA904-96-1-0104.