Programming Models and Synchronization Techniques for Disconnected Business Applications

doi:10.1016/S0065-2458(05)67002-7

Advances in Computers

Volume 67, 2006, Pages 85-130

https://doi.org/10.1016/S0065-2458(05)67002-7

Abstract

Programming models usefully structure the way that programmers approach problems and develop applications. Business applications need properties such as persistence, data sharing, transactions, and security, and various programming models exist—for connected environments—that facilitate the development of applications with these properties. Recently, it has become possible to consider running business applications on disconnected devices. Developers thus confront two areas of concern. The first is to solve the pragmatic problems of how to implement the properties required by business applications in a disconnected environment. The second is to determine whether programming models for disconnected environments exist (as they do for connected environments) that facilitate the development of business applications.

This chapter discusses these two areas of concern. We explain why business applications are particularly hard to “project” to disconnected devices. We then introduce some of the approaches used to solve these problems (focusing especially on data replication and method replay techniques), and the programming models that exist for the disconnected environment. Finally, we analyze whether connected programming models for business applications can be usefully projected to disconnected environments. We compare the data replication and method replay approaches, discuss the features of each, and show that a connected programming model is useful even in a disconnected environment.

Introduction

A business application is characterized by the fact that the application (1) updates state that is shared by multiple users; (2) must perform these updates transactionally [15] to a shared database; and (3) must operate securely. Business applications therefore have requirements that other applications do not: chiefly, to access persistent shared datastores securely and transactionally. Programming models can ease the difficulty of developing complex business logic that meets these requirements. This is typically done by abstracting business application requirements as generic services or middleware that the developer can access in as unobtrusive a manner as possible. Good programming models enable a “separation of concerns” through which the application developer can concentrate on the application-specific logic, and assume that the deployed application will meet the business application requirements. Well-known examples of such programming models include CORBA [4], DCOM [6], and Enterprise JavaBeans (EJBs) [10].

Business applications have traditionally been deployed in connected environments in which the shared database can always be accessed by the application. In contrast, when applications are deployed to mobile devices such as personal digital assistants (PDAs), hand-held computers, and laptop computers, these devices are only intermittently able to interact with the shared database. (In a client/server environment, the shared database resides on the server.) Historically, resource constraints (e.g., memory and CPU) have precluded disconnected devices from running business applications. Ongoing technology trends, however, imply that such resource constraints are disappearing. For example, DB2 Everyplace [7] (a relational database) and WebSphere MQ Everyplace [29] (a secure and dependable messaging system) run on a wide variety of platforms such as PocketPC™, PalmOS™, QNX™, and Linux; they are also compatible with J2ME [22] configurations/profiles such as CDC and Foundation. It seems likely that mobile devices will even be able to host middleware such as an Enterprise JavaBeans container. As a result, business applications that previously required the resources of an “always connected” desktop computer can potentially run on a disconnected device.

Of course, there are non-business applications which do not have these requirements, but we argue that this set is declining in size and importance. For example, even simple mobile applications typically support synchronization of updates back to the user's personal PC. Since the PC copy of the database may be updated by both the synchronization agent and other PC-based applications (e.g., calendaring), the database is, in fact, shared. Also, users will probably be very disappointed to discover that synchronization of updates did not occur transactionally (e.g., if concurrent updates to the same record were not detected and resolved in some way). Finally, security of PDA databases is certainly a concern nowadays.

However, other issues, besides resource constraints, have precluded deployment of business applications on disconnected devices. Fundamental algorithmic and infrastructure problems must also be solved. The algorithmic problems stem from the fact that the application executes while disconnected from the server, but the work performed must later be propagated to the server. To see why this is a problem, consider the fact that business applications, by our definition, are structured as application logic that reads from, and writes to, a transactional database that can be concurrently accessed by other applications. Connected business applications have taken for granted that the transactional database can always be accessed by the application. Even if they are structured so as to access locally cached data for “read” operations, state changes (“updates”) must still be applied to the shared, master database [12], [25] at the completion of each user operation. Obviously, the shared-database assumption does not hold when business applications are disconnected: they are then forced to read from, and write to, a database that is not shared by other applications and users. Also, almost inevitably, the disconnected application will execute against data that is out-of-date with respect to the server's version of the data. How can work performed on the disconnected device be merged into the shared database in a manner that preserves the transactional behavior of both disconnected and connected clients? The lock-based concurrency control mechanisms used in connected environments to prevent concurrent updates and other transaction serializability violations are not suitable for disconnectable applications because they unacceptably reduce database availability. Also, lock-based concurrency control is simply not dynamic enough; it is typically impossible to know what needs to be locked before the device disconnects from the server.
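The merge problem described above can be made concrete with a minimal sketch of optimistic (non-locking) conflict detection at synchronization time. It assumes that every server row carries a version number and that each disconnected update remembers the version it was based on; the class and method names (`VersionedRow`, `PendingUpdate`, `OptimisticSync.synchronize`) are hypothetical and are not taken from any product or prototype discussed in this chapter. For brevity the sketch checks each update individually, whereas a real synchronizer would accept or reject all updates of one transaction as a unit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One row of the shared (server) database, tagged with a version number (an assumption). */
final class VersionedRow {
    final int version;
    final int quantity;

    VersionedRow(int version, int quantity) {
        this.version = version;
        this.quantity = quantity;
    }
}

/** An update performed while disconnected, tagged with the version that was checked out. */
final class PendingUpdate {
    final String key;
    final int baseVersion;
    final int newQuantity;

    PendingUpdate(String key, int baseVersion, int newQuantity) {
        this.key = key;
        this.baseVersion = baseVersion;
        this.newQuantity = newQuantity;
    }
}

final class OptimisticSync {
    /**
     * Applies each update whose base version still matches the server's version
     * and returns the keys of the updates that were rejected as conflicts.
     */
    static List<String> synchronize(Map<String, VersionedRow> server, List<PendingUpdate> updates) {
        List<String> conflicts = new ArrayList<>();
        for (PendingUpdate update : updates) {
            VersionedRow current = server.get(update.key);
            if (current == null || current.version != update.baseVersion) {
                // The row was deleted or changed on the server after check-out.
                conflicts.add(update.key);
                continue;
            }
            server.put(update.key, new VersionedRow(current.version + 1, update.newQuantity));
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Map<String, VersionedRow> server = new HashMap<>();
        server.put("widget", new VersionedRow(1, 10));
        server.put("gadget", new VersionedRow(1, 5));

        // A connected client changes "gadget" while the device is disconnected.
        server.put("gadget", new VersionedRow(2, 4));

        List<PendingUpdate> updates = new ArrayList<>();
        updates.add(new PendingUpdate("widget", 1, 8));
        updates.add(new PendingUpdate("gadget", 1, 3));

        System.out.println("Conflicting keys: " + synchronize(server, updates));
    }
}
```

No lock is held while the device is disconnected; the cost is that stale updates are only discovered, and must be resolved, when the device reconnects.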

Infrastructure must also be developed to deal with the life-cycle of an application deployed to a disconnected device. Data must first be “checked out” (copied) from the shared database; the data are then used by the application; and the committed work must be merged into the server database when the device reconnects. Without middleware that provides replication (from the server to the device) and synchronization (from the device to the server) functions, each application must provide its own implementation of these features. A programming model is therefore needed to facilitate development of business applications for disconnected devices. The programming model must provide constructs that address these algorithmic issues, and must integrate with middleware that provides the services described above.
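The three life-cycle steps named above (check out, disconnected use, merge on reconnection) can be sketched as a small device-side store. This is only an illustration under simplifying assumptions: the server database is modeled as an in-memory map, `DisconnectedStore` and its methods are hypothetical names, and conflict detection during the merge is deliberately left out.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical device-side store that follows the check-out / use / merge life-cycle. */
final class DisconnectedStore {
    private final Map<String, Integer> localCopy = new HashMap<>();
    private final Map<String, Integer> pending = new LinkedHashMap<>();

    /** Replication (server to device): copy the requested rows before disconnecting. */
    void checkOut(Map<String, Integer> server, Collection<String> keys) {
        for (String key : keys) {
            Integer value = server.get(key);
            if (value != null) {
                localCopy.put(key, value);
            }
        }
    }

    /** Disconnected work: reads see only the data that was checked out. */
    Integer read(String key) {
        return localCopy.get(key);
    }

    /** Disconnected work: writes change the local copy and are remembered for later. */
    void write(String key, int value) {
        localCopy.put(key, value);
        pending.put(key, value);
    }

    /** Synchronization (device to server): propagate the remembered writes on reconnection. */
    int mergeInto(Map<String, Integer> server) {
        int applied = pending.size();
        server.putAll(pending); // No conflict detection in this sketch.
        pending.clear();
        return applied;
    }
}

final class LifeCycleDemo {
    public static void main(String[] args) {
        Map<String, Integer> server = new HashMap<>();
        server.put("item-1", 10);
        server.put("item-2", 20);

        DisconnectedStore device = new DisconnectedStore();
        device.checkOut(server, Arrays.asList("item-1")); // stage: replicate
        device.write("item-1", 7);                        // stage: work while disconnected
        int applied = device.mergeInto(server);           // stage: synchronize

        System.out.println("Applied " + applied + " update(s); server item-1 = " + server.get("item-1"));
    }
}
```

If every application had to hand-write the equivalent of `checkOut` and `mergeInto`, the replication and synchronization logic would be duplicated across applications, which is the motivation for placing these functions in middleware.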

Comparing programming models is very difficult, because reasonable people can disagree about (1) the correct set of evaluation criteria and (2) how well a given programming model performs with respect to a set of evaluation criteria. Thus, even in environments with which people have much experience, such as connected business applications, discussing the superiority of EJBs versus CORBA versus DCOM can produce much more heat than light. This difficulty is compounded for emerging areas such as disconnected business applications. In addition, programming models may have features that are interesting in their own right, independently of whether applications execute in a connected or disconnected environment.

In this chapter, we shall adopt the following approach. Our chief evaluation criterion is whether, and to what degree, a disconnected programming model is a projection of a connected programming model onto disconnected devices. By “projection,” we acknowledge explicitly that developers will always have to take the disconnected environment into account. The goal, however, is for the programming model to enable application semantics that are identical, or similar, to a connected application. The algorithmic issues discussed above should be solved in a way that is transparent to the developer, who does not have to write more (or different) code than she does for the connected environment. However, we are also interested in what the code “looks like,” and the features that are exposed to developers independently of connection-specific issues. Our criteria here will be more subjective since one person's “feature” is another person's “needless complexity.” The chapter will use snippets of code to give a concrete sense of the programming model.

Much work has been done in the area of transactionally synchronizing work performed on a disconnected client to the server. The implications of disconnection for transactional applications are well known (see [39] for a recent survey of the area of “mobile transactions”). The presentation in this chapter differs in that we focus on whether, and how, a mature programming model can be projected to disconnected devices in a way that takes advantage of such prior algorithmic work. The ability to project existing connected programming models is important because it can reduce an application's development and maintenance costs. Development costs are reduced because developers can use their existing programming model experience to develop disconnected applications. Maintenance costs are reduced because differences between the connected and disconnected versions of an application are minimized.

We shall also refer to a prototype that demonstrates that the Enterprise JavaBeans [10] programming model can be projected onto disconnected devices. Useful work can be performed on the disconnected device (i.e., few constraints are imposed), while the likelihood of synchronization problems is minimized. The prototype is interesting because it shows how middleware can concretely realize the connected programming model on disconnected devices.

Our chapter focuses on how a connected programming model can be projected to disconnected devices. This part of our work is closely related to the area of mobile transactions [39]. However, we believe that a simpler programming model and synchronization algorithm than many proposed in the mobile transactions literature is adequate to project business applications to disconnected devices. We assert that this simpler approach is sufficient for several reasons.

First, we address a programming model for environments that are more robust than is typically assumed for mobile transactions. Much mobile transactions research, for example, assumes that these transactions execute in resource-constrained environments. Such research must therefore address issues related to limited bandwidth capacity, communication costs, and energy consumption. In contrast, we assume that business applications are deployed to (increasingly powerful) devices that can locally execute business applications against a transactional database.

Second, we assume that transactions are able to execute entirely on the disconnected device, without assistance from a server. This allows considerable simplification compared to the mobile transactions work designed to support transaction processing in which a client may initiate transactions on servers or may distribute transactions among the mobile client device and servers. Such environments require that transaction processing be supported while the mobile device moves from one networked cell to another or drops its network connections. Mobile transaction models such as Kangaroo Transactions [9] are explicitly designed to operate in such complex environments, whereas the synchronization techniques discussed here can use traditional transaction semantics. Similarly, the synchronization techniques discussed here do not deal with distributed transactions (between the mobile device and the network), nor do they deal with heterogeneous multi-database systems. Our focus, instead, is to jump-start deployment of business applications to disconnected devices in well-controlled environments.

Third, we disagree with the assumption made by some research that optimistic (non-locking) concurrency control mechanisms must perform badly for the long disconnect durations typical of mobile transactions. Such research assumes that the classic optimistic algorithms [15] perform well only for short disconnections, and will experience unacceptable abort ratios for long disconnections. Non-traditional transaction models such as pre-write operations [27] and dynamic object clustering replication schemes [32] are designed to increase concurrency by avoiding such aborts. In our experience, however, business processes greatly reduce the actual occurrence of such aborts by implicitly partitioning data among application users. Furthermore, the transform-based approach used by method replay synchronization (Section 3.2) is designed to reduce the size of a transactional footprint, and thus reduces the probability of aborts during synchronization.
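One way to picture why replaying methods can lead to fewer aborts than shipping data values is the following sketch, which is our own illustration and not the algorithm of Section 3.2. It assumes that the device logs business-method invocations rather than the resulting row values, and that the server re-executes each logged invocation against its current state. All names (`OrderInvocation`, `MethodReplay.placeOrder`, `MethodReplay.replay`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** A logged method invocation: "order `quantity` units of `item`". */
final class OrderInvocation {
    final String item;
    final int quantity;

    OrderInvocation(String item, int quantity) {
        this.item = item;
        this.quantity = quantity;
    }
}

final class MethodReplay {
    /** The business method; the same logic runs on the device and, during replay, on the server. */
    static boolean placeOrder(Map<String, Integer> stock, String item, int quantity) {
        Integer inStock = stock.get(item);
        if (inStock == null || inStock < quantity) {
            return false; // Business rule violated: unknown item or insufficient stock.
        }
        stock.put(item, inStock - quantity);
        return true;
    }

    /** Replays a device's log against the server's current stock; returns the invocations that failed. */
    static List<OrderInvocation> replay(Map<String, Integer> serverStock, List<OrderInvocation> log) {
        List<OrderInvocation> failed = new ArrayList<>();
        for (OrderInvocation call : log) {
            if (!placeOrder(serverStock, call.item, call.quantity)) {
                failed.add(call);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Map<String, Integer> serverStock = new HashMap<>();
        serverStock.put("widget", 10);

        // Two agents work disconnected, each starting from a stock level of 10.
        List<OrderInvocation> agentA = new ArrayList<>();
        agentA.add(new OrderInvocation("widget", 3));
        List<OrderInvocation> agentB = new ArrayList<>();
        agentB.add(new OrderInvocation("widget", 4));

        int failures = replay(serverStock, agentA).size() + replay(serverStock, agentB).size();
        System.out.println("Failed invocations: " + failures + ", remaining stock: " + serverStock.get("widget"));
    }
}
```

Had each agent instead shipped the absolute value it computed locally (7 and 6, respectively), a value-based comparison would flag the second synchronization as a conflict, because the row changed after check-out. Under replay, an invocation fails only if the business rule itself no longer holds on the server.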

Finally, note that method replay synchronization and the synchronization middleware discussed later (Section 4.5) build on earlier work using log-replay in support of long-running transactions [30]. These ideas are similar to the approach taken by the IceCube [17] system, although IceCube does not focus on transactional applications.

The chapter is structured as follows. Section 2 explains one of the main challenges faced by disconnected programming models, namely the need to support the life-cycle of a generic disconnected application. Section 3 introduces two classes of synchronization techniques: data replication and method replay. Section 4 discusses two business application programming models. In one (Enterprise JavaBeans), the programming model was intended for connected environments. In the other (Service Data Objects), the programming model is intended to be used in both connected and disconnected environments. We discuss interesting features of both programming models, and show some of their implications with respect to building business applications. We close, in Section 5, by evaluating various programming models for disconnected applications.

The chapter will use the following “order entry” application to help motivate the discussion. Order Entry enables agents to record customer orders using a stock catalog consisting of line-items and in-stock quantities. If a customer has not previously placed orders, the agent enters information about the new customer into the system.

Figure 1 shows the top-level entities used in the application.

Figure 2 shows the internal structure of an agent and line-item, and Fig. 3 shows the internal structure of the customer entity.

Figure 4 shows how an order has references to both the agent who placed the order, and to the customer for whom the order was placed.

As we shall show, the relative lack of complexity in order entry does not detract from its ability to illustrate some of the key issues in building disconnected business applications.
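For readers without access to the figures, the entities of the order entry application might be rendered as the plain Java classes below. Only the entity names and the fact that an order references both its agent and its customer come from the description above; every field and method (`name`, `sku`, `inStock`, `addLine`, `totalUnits`) is an assumption made for illustration and does not reproduce the internal structure shown in Figs. 2 and 3.

```java
import java.util.ArrayList;
import java.util.List;

/** The agent who records orders (fields are illustrative). */
final class Agent {
    final String name;

    Agent(String name) {
        this.name = name;
    }
}

/** The customer for whom an order is placed (fields are illustrative). */
final class Customer {
    final String name;

    Customer(String name) {
        this.name = name;
    }
}

/** A stock catalog entry with its in-stock quantity. */
final class LineItem {
    final String sku;
    int inStock;

    LineItem(String sku, int inStock) {
        this.sku = sku;
        this.inStock = inStock;
    }
}

/** One ordered line: a catalog entry plus the quantity requested. */
final class OrderLine {
    final LineItem item;
    final int quantity;

    OrderLine(LineItem item, int quantity) {
        this.item = item;
        this.quantity = quantity;
    }
}

/** An order references both the agent who placed it and the customer it was placed for. */
final class Order {
    final Agent agent;
    final Customer customer;
    private final List<OrderLine> lines = new ArrayList<>();

    Order(Agent agent, Customer customer) {
        this.agent = agent;
        this.customer = customer;
    }

    void addLine(LineItem item, int quantity) {
        lines.add(new OrderLine(item, quantity));
    }

    int totalUnits() {
        int total = 0;
        for (OrderLine line : lines) {
            total += line.quantity;
        }
        return total;
    }
}

final class OrderEntryDemo {
    public static void main(String[] args) {
        Agent agent = new Agent("agent-1");
        Customer customer = new Customer("customer-1");
        Order order = new Order(agent, customer);
        order.addLine(new LineItem("widget", 10), 3);
        order.addLine(new LineItem("gadget", 5), 2);
        System.out.println(order.agent.name + " ordered " + order.totalUnits()
                + " units for " + order.customer.name);
    }
}
```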

Section snippets

Life-Cycle of a Disconnected Business Application

In order to be successful, a programming model for disconnected devices must be compatible with the life-cycle of a disconnected application.

Deployment of a disconnectable business application requires that an administrator perform a one-time setup (life-cycle stage 0) of the mobile device's database(s). The key challenge in stage 0 is to replicate sufficient data (from the server to the device) such that the application can execute correctly. This can be a difficult task when an application

Data Replication

The data replication synchronization technique represents a change set as a log of data modifications that were performed on the disconnected device. (The term “modifications” denotes data creation and deletion as well as data changes.) Data replication is used by both DB2e [7] and Lotus Notes [26]. It also underlies the notion of cached RowSets [33] in which the reference implementation uses optimistic concurrency-control. The synchronization process begins by transmitting the data

Disconnected Programming Models: Service Data Objects and Enterprise JavaBeans

A programming model can be viewed as a contract between a computing environment and developers who write programs that execute in that environment. On the one hand, the contract specifies “services” that the computing environment must provide to developers. On the other hand, the contract imposes “constraints” on the type of programs that developers may write. A programming language is thus associated with a programming model. For example, the Java programming model specifies that the Java

Evaluating Disconnected Programming Models

This chapter contends that a connected programming model for business applications can be projected to disconnected devices such that:

• the semantics of the programming model are (almost) unchanged regardless of whether the application executes in a connected or disconnected environment;
• the need to synchronize the device's work to the server is hidden from developers by a combination of the programming model and middleware;
• applications developed with this approach can be usefully deployed to

Summary and Conclusion

In the past, device resource constraints such as CPU and memory precluded even considering whether to execute business applications on disconnected devices. Now that such resource constraints are disappearing, we are forced to determine whether the algorithmic issues related to client synchronization of disconnected work preclude disconnected business applications. Similarly, the feasibility of programming models that facilitate the development of such applications becomes increasingly

Acknowledgements

We thank Steve Brodsky, IBM Software Group, SDO Architect, for important feedback on an early draft of this chapter.

References (44)

Special Issue on Aspect-Oriented Programming, Commun. ACM (October 2001)
Brodsky S., private communication, dated...
A. Bainbridge et al., CICS and enterprise JavaBeans, IBM Systems J. (2001)
J. Siegel, Quick CORBA 3 (2001)
Date C.J., “An architecture for high-level language database extensions”, in: Proc. ACM SIGMOD, 1976, pp. 101–122. Note...
F.E. Redmond, DCOM: Microsoft Distributed Component Object Model (1997)
IBM DB2 Everyplace
Data Transfer Object
M.H. Dunham et al., Mobile transaction model that captures both the data and movement behavior, Mobile Networks Appl. (1997)
J2EE Enterprise JavaBeans Technology
Enterprise JavaBeans Specification Version 1.0
M.J. Franklin et al., Transactional client-server cache consistency: alternatives and performance, ACM Trans. Database Systems (TODS) (1997)
E. Gamma, Design Patterns: Elements of Reusable Object-Oriented Software (1995)
Butrico M.A., et al., “Mobile transaction middleware with Java-object replication”, in: Proc. Third USENIX Conference...
J. Gray et al., Transaction Processing: Concepts and Techniques (1993)
Java Message Service (JMS)
Kermarrec A.-M., et al., “The IceCube approach to the reconciliation of diverging replicas”, in: Proc. 20th Annual ACM...
IceCube
IBM IMS Family
T. Greanier, Discover the secrets of the Java serialization API
Java Database Connectivity (JDBC)
Java 2 Platform, Micro Edition (J2ME)
