Future Generation Computer Systems

Volume 87, October 2018, Pages 502-513
Towards increasing reliability of clouds environments with RESTful web services

https://doi.org/10.1016/j.future.2017.10.050

Highlights

  • The paper increases the reliability of processing in cloud computing with RESTful services.

  • It proposes a recovery consistency model and determines its formal requirements.

  • The model exploits the semantics of REST operations and the structure of resources.

  • A recovery protocol that increases the fault-tolerance of RESTful web services is also introduced.

  • The protocol extends the ReServE service and guarantees the RESTful recovery consistency model.

  • The correctness of the proposed solution is formally verified and its efficiency is evaluated.

Abstract

Recently, cloud computing has become a desired standard for the development of distributed, large-scale applications. It allows various IT resources to be exploited on demand, ranging from virtual machines, storage, and databases to a broad set of cloud web services. Such services are closely intertwined with service-oriented architectures, and their design and implementation are often supported by the Representational State Transfer (REST) methodology. Although cloud environments offering RESTful services constitute a desired computing model, their adoption is not free from challenges. Among others, ensuring the reliability of such environments poses difficulties and affects their usability. Since simply restarting a failed processing (as a new instance) from the very beginning is usually insufficient and often leads to unacceptable inconsistency of the processing state, in this paper we propose a fault-tolerance mechanism that copes with this problem. In the proposed solution we introduce a recovery consistency model that exploits the semantics of REST operations and the structure of resources in order to flexibly determine formal requirements ensuring that the recovered processing state is compatible with REST constraints. Next, we put forward efficient external support for the recovery of RESTful web services by presenting a protocol that extends the ReServE rollback-recovery service and guarantees the RESTful recovery consistency model. We provide a proof of protocol correctness and experimentally evaluate the efficiency of the proposed approach.

Introduction

Over the past decade, cloud computing has become one of the major paradigms of distributed processing [[1], [2]]. In this paradigm, heterogeneous IT resources come from different vendors and are used by clients on an on-demand basis. Resources may encompass virtual machines, storage, databases, and a broad set of web services, developed with the use of different technology stacks. The complexities of the underlying stacks and protocols should be isolated from the business logic in order to enable cloud services to work efficiently. Therefore, service-oriented architecture (SOA) [[3], [4], [5]], with its flexible and modular approach to the delivery of services, is often used to support the implementation of cloud environments [[6], [7]]. Among the predominant approaches to developing loosely coupled, autonomous SOA services are the Simple Object Access Protocol (SOAP) [[8], [9]] and the Representational State Transfer (REST) architectural style. The latter approach seems to be especially promising in the case of cloud environments: it specifies constraints that, if applied to web services, bring advantages desirable in such systems, e.g. simplicity, performance, and scalability. RESTful services focus on system resources and on the way they are addressed and exchanged by service clients. Resources are identified by universal identifiers embodied by URIs, and are manipulated through uniform interfaces, mostly HTTP methods. As a result, any tool that can work with the HTTP protocol can use a RESTful service.

Despite the benefits associated with applying SOA and RESTful services in cloud computing, there are also several challenges that have to be addressed during the construction of such environments [[10], [11], [12], [13]]. Among them, the reliability issue is particularly important. SOA, like all distributed systems, is highly vulnerable to failures, which can lead to limited availability of RESTful services and their resources. As a consequence, the reliability of the whole system is affected. Correct continuation of the processing after a failure is feasible only when the restored state is perceived by all participants in a consistent manner. Essentially, the consistent state should reflect the observable system behavior from before the failure [[14], [15], [16]]. To meet this requirement, after the failure and the restart of processing, the results of requests issued by clients before the failure should be visible in the service state, and the results of executing these invocations should be reflected in the state of the client.

Although the statelessness of RESTful services in cloud environments is often emphasized, such services in fact maintain the state of their resources. Consequently, the processing state, comprising the client's state and the representations of resources from various services, may be lost due to the failure of a client or a service, or it may be outdated after state recovery. This may prevent a client from manipulating service resources, especially when the pre-failure resource state was used by the client to make some decisions and is to be used further in its upcoming processing. Although, in general, weak consistency is a feature of many web-based, resource-oriented distributed systems, in some situations the inconsistency described above is intolerable and may lead to a loss of income for cloud infrastructure providers. Such a scenario is also unacceptable to clients, who expect uninterrupted business processing. Therefore, designing a mechanism that increases the fault-tolerance of cloud computing with RESTful services, and that enables consistent continuation of processing despite failures, is an open issue.

In many existing SOA systems, a compensation procedure is applied to mask failures and restore a consistent processing state [17]. However, in some cases carrying out the compensation is either impossible or too costly. In such cases, the rollback-recovery approach [14], used to increase the fault-tolerance of distributed systems, may be applied. The use of rollback-recovery in SOA requires, however, taking into account the characteristics of this environment. In particular, due to the autonomy of services, the recovery of one of them cannot influence the processing of the others or force them to roll back in order to ensure the consistency of the overall processing. Thus, there is a need to adjust rollback-recovery mechanisms to SOA-based cloud environments.

For this reason, we introduced the ReServE environment [[15], [18], [16]], which increases the fault-tolerance of such systems. ReServE is a third-party, external environment, not related to any client or service. It ensures that in the case of a failure of one or more system components, the processing state is transparently recovered and consistently perceived by clients and services. The proposed environment takes advantage of the fact that processing in SOA is based on the exchange of messages. Thus, ReServE intercepts the communication between the participants of the processing and logs the messages exchanged among them. The intercepted messages reflect the history of communication, which is then used to recover a consistent processing state in the case of a failure of any system component. The proposed solution does not require the involvement of clients and respects the autonomy of services (among other things, each service can apply its own reliability policies).
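The interception idea described above can be illustrated with a minimal sketch. It is not the ReServE implementation: the names `MessageLog` and `Interceptor`, the in-memory list standing in for persistent storage, and the modelling of a service as a plain callable are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class MessageLog:
    """Hypothetical stand-in for persistent storage: an append-only list."""
    entries: List[Tuple[str, int, str]] = field(default_factory=list)

    def append(self, kind: str, request_id: int, payload: str) -> None:
        self.entries.append((kind, request_id, payload))


class Interceptor:
    """Sits between a client and a service and records every message."""

    def __init__(self, service: Callable[[str], str], log: MessageLog) -> None:
        self.service = service
        self.log = log
        self._next_id = 0

    def handle(self, request: str) -> str:
        request_id = self._next_id
        self._next_id += 1
        # Log the request before forwarding it, so that it survives
        # a failure of the service during execution.
        self.log.append("request", request_id, request)
        response = self.service(request)
        # Log the response so the history reflects what the client observed.
        self.log.append("response", request_id, response)
        return response


if __name__ == "__main__":
    log = MessageLog()
    proxy = Interceptor(lambda req: "ok:" + req, log)
    proxy.handle("PUT /orders/1")
    for entry in log.entries:
        print(entry)
```

Because the client talks only to the interceptor, neither the client nor the service has to be modified in this sketch, which mirrors the transparency and autonomy requirements stated in the paragraph above.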

In view of the current trend towards resource orientation, in this paper we propose to extend the ReServE environment to comply with the RESTful approach. The contribution of this paper is threefold. First, we propose a definition of the RESTful recovery consistency model, which uses the semantics of REST operations and the structure of resources to indicate when the recovered service state is compatible with REST constraints. Finding a consistent processing state is important for analyzing, testing and verifying properties of RESTful computations. Next, having formally specified the consistency requirements for RESTful processing, we introduce the outline of a recovery protocol that guarantees the proposed RESTful recovery consistency model, and apply it in the ReServE environment [[15], [16], [18]] in order to optimize the recovery procedure and decrease its costs. Finally, we provide a formal and empirical evaluation of the proposed recovery protocol, and discuss the impact of resource structure on the recovery overhead.

The remainder of the paper is organized as follows. Section 2 discusses the related work. Section 3 describes the system model, and Section 4 provides a case study illustrating the usage of the proposed RESTful recovery consistency model, together with its formal definition. Section 5 introduces the general idea of ReServE. Then, in Section 6, the rollback-recovery protocol that provides the proposed recovery consistency model is presented, and its correctness proof is given in Section 7. Results of simulation experiments are provided in Section 8. Finally, Section 9 concludes the paper.


Related work

A handful of proposals aiming to enhance the reliability of SOA-based systems can be found in the literature [[19], [20], [21], [22]]. In this section we examine several representative approaches, including client-side applications that increase fault-tolerance, transaction-based approaches, and fault-tolerance frameworks.

The paper [23] introduces the FT-REST architectural framework that supports development and implementation of fault-tolerant, reliable and adaptive clients. FT-REST consists of a

System model

In the considered system model, service providers deliver the required functionality in the form of RESTful web services. Such services comprise a set of interconnected resources. The applied hypermedia model regulates the relationships among resources and defines their potential state changes. Because HTTP is considered the primary protocol for constructing RESTful services, in this paper we identify resources by their Uniform Resource Identifiers (URIs). Each resource has at

Case study

The problem of potential inconsistency in resource-oriented systems occurs in two cases: during failure-free processing, due to concurrent access by two or more clients to the same resources, and as a result of failures of system components, when some components lose (or partially lose) their state, on which the state of other components depended. The first problem can be solved, for example, by using transactions for RESTful systems [[35], [36]], and is orthogonal to the latter problem

ReServE architecture

This section describes the reliable service environment ReServE, which increases the reliability of processing in service-oriented systems. The architecture of the proposed solution is presented in Fig. 2. The core of the ReServE environment consists of a set of Recovery Management Units (RMUs) that manage the persistent storage, store incoming requests from clients and responses from services during failure-free processing, and are responsible for carrying out recovery when needed. RMUs cooperate

Rollback-recovery in a RESTful way

Assuming the use of the ReServE service, we use messages from the log to restore a consistent processing state compatible with REST constraints.

During failure-free execution, the proposed RESTful recovery protocol, like ReServE, saves the obtained messages in the Stable Storage of the RMU. However, in the case of a failure, the introduced solution divides the recovery process into two phases. The first phase analyzes the consecutive RMU log entries that represent requests performed by the
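As a rough illustration of a two-phase recovery of this kind, the sketch below first reduces a request log and then replays only the reduced set. The reduction rules are assumptions based on standard HTTP method semantics (GET and HEAD are safe, PUT replaces the target resource, DELETE removes it), not the rules defined in the paper. The service model is also hypothetical: state is a flat mapping from URI to representation, only GET, HEAD, PUT and DELETE occur, and DELETE also removes sub-resources under the deleted URI. The names `Request`, `build_recovery_set` and `replay` are illustrative.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Request:
    method: str
    uri: str
    body: Optional[str] = None


def covers(uri: str, other: str) -> bool:
    """True if `other` is `uri` itself or a sub-resource under it."""
    return other == uri or other.startswith(uri.rstrip("/") + "/")


def build_recovery_set(log: List[Request]) -> List[Request]:
    """Phase 1: scan the log in order and keep only requests that still matter."""
    kept: List[Request] = []
    for req in log:
        if req.method in ("GET", "HEAD"):
            continue  # safe methods do not change service state
        if req.method == "PUT":
            # A later PUT fully replaces the result of earlier PUTs to the same URI.
            kept = [r for r in kept if not (r.method == "PUT" and r.uri == req.uri)]
        elif req.method == "DELETE":
            # Earlier requests on the deleted resource or its sub-resources are moot.
            kept = [r for r in kept if not covers(req.uri, r.uri)]
        kept.append(req)
    return kept


def replay(requests: List[Request],
           state: Optional[Dict[str, Optional[str]]] = None) -> Dict[str, Optional[str]]:
    """Phase 2: re-execute requests against the (hypothetical) service model."""
    result = dict(state or {})
    for req in requests:
        if req.method == "PUT":
            result[req.uri] = req.body
        elif req.method == "DELETE":
            for uri in [u for u in result if covers(req.uri, u)]:
                del result[uri]
    return result


if __name__ == "__main__":
    log = [
        Request("PUT", "/orders/1", "draft"),
        Request("GET", "/orders/1"),
        Request("PUT", "/orders/1", "paid"),
        Request("PUT", "/orders/2", "draft"),
        Request("PUT", "/orders/2/items/1", "book"),
        Request("DELETE", "/orders/2"),
        Request("PUT", "/orders/3", "draft"),
    ]
    recovery_set = build_recovery_set(log)
    print(len(log), "logged,", len(recovery_set), "replayed")
    print(replay(recovery_set) == replay(log))
```

Under these assumptions, replaying the reduced set yields the same final state as replaying the full log while executing fewer requests. Services with POST requests, server-assigned identifiers, or dependencies between parent and child resources would need more careful rules than this sketch provides.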

Protocol safety

We present a proof sketch showing that the proposed recovery protocol with the RESTful recovery consistency model leads to the recovery of a consistent processing state that meets REST constraints, despite recovering only a subset of the requests performed by a service before its failure.

Lemma 1

A RecoverySet contains all requests whose execution leads to the creation of the same resources, and the same interdependencies between them, as those maintained by the service at the moment of failure.

Protocol evaluation

Taking into account that requests of different services are incomparable in terms of time consumption, in the performed experiments we assess the number of requests before the failure and after the recovery.

To evaluate the proposed protocol, a tester application was created. It takes a configuration file (representing a declarative description of an interaction with a service) as an input and creates several data sets as an output. A sample configuration file is described in Listing 1. The

Conclusions

In the last few years, REST has become a prevalent design model for cloud services. According to [38], RESTful services account for more than 70% of all cloud services, and their number is constantly increasing. Consequently, the number of applications that use cloud environments with RESTful services is also growing. For many of them the reliability aspect is essential. Thus, increasing the fault-tolerance of environments that use RESTful services is of paramount importance. One of

References (38)

  • D. Box, D. Ehnebuske, G. Kakivaya, A. Layman, N. Mendelsohn, H.F. Nielsen, S. Thatte, D. Winer, Simple object access...
  • Scribner K. et al., Understanding SOAP: Simple Object Access Protocol (2000)
  • Dillon T. et al., Cloud computing: issues and challenges
  • Zhang Q. et al., Cloud computing: state-of-the-art and research challenges, J. Internet Serv. Appl. (2010)
  • Elmootazbellah N. et al., A survey of rollback-recovery protocols in message-passing systems, ACM Comput. Surv. (2002)
  • Brzeziński J. et al., D-reserve: Distributed reliable service environment
  • Brzeziński J. et al., Towards relaxed rollback-recovery consistency in SOA
  • Maamar Z. et al., Towards an approach to sustain web services high-availability using communities of web services, Int. J. Web Inform. Syst. (2009)
  • Hołenko M. et al., The impact of service semantics on the consistent recovery in SOA
    Anna Kobusińska received her M.Sc. and Ph.D. degrees in computer science from Poznań University of Technology, in 1999 and 2006, respectively. She currently works as an Associate Professor at the Laboratory of Computing Systems, Institute of Computing Science, Poznań University of Technology, Poland. Her research interests include large-scale distributed systems, service-oriented and cloud computing. She focuses on distributed algorithms, Big Data analysis, replication and consistency models, as well as fault-tolerance, specifically checkpointing and rollback recovery techniques.

    She has served and is currently serving as a PC member of several international conferences and workshops. She is also the author and co-author of many publications in high-quality, peer-reviewed international conferences and journals. She has participated in various research projects supported by national organizations and by the EC, in collaboration with academic institutions and industrial partners.

    Ching-Hsien Hsu is a professor and the chairman of the CSIE department at Chung Hua University, Taiwan. He was a distinguished chair professor at Tianjin University of Technology, China, during 2012–2016. His research includes high performance computing, cloud computing, parallel and distributed systems, big data analytics, ubiquitous/pervasive computing and intelligence. He has published 200 papers in these areas, including in top journals such as IEEE TPDS, IEEE TSC, IEEE TCC, IEEE TETC, IEEE T-SUSC, IEEE Systems, IEEE Network, IEEE Communications, and ACM TOMM. Dr. Hsu serves on the editorial boards of a number of prestigious journals, including IEEE TSC, IEEE TCC, IJCS, and JoCS. He has been an author/co-author or an editor/co-editor of 10 books from Elsevier, Springer, IGI Global, World Scientific and McGraw-Hill. Dr. Hsu has received talent awards six times from the Ministry of Science and Technology and the Ministry of Education, and the distinguished award for excellence in research nine times from Chung Hua University, Taiwan. He is vice chair of IEEE TCCLD, a member of the executive committees of IEEE TCSC and the Taiwan Association of Cloud Computing, and an IEEE senior member.

    This work was supported by the Polish National Science Center under Grant No. DEC-2011/03/D/ST6/01331.