10th EAI International Conference on Performance Evaluation Methodologies and Tools

Research Article

Resiliency Quantification for Large Scale Systems: An IaaS Cloud Use Case

  • @INPROCEEDINGS{10.4108/eai.25-10-2016.2266805,
        author={Rahul Ghosh and Francesco Longo and Vijay Naik and Andrew Rindos and Kishor Trivedi},
        title={Resiliency Quantification for Large Scale Systems: An IaaS Cloud Use Case},
        proceedings={10th EAI International Conference on Performance Evaluation Methodologies and Tools},
        publisher={ACM},
        proceedings_a={VALUETOOLS},
        year={2017},
        month={5},
        keywords={cloud resiliency interacting sub-models non-homogeneous markov chains},
        doi={10.4108/eai.25-10-2016.2266805}
    }
    
Rahul Ghosh (1), Francesco Longo (2, *), Vijay Naik (3), Andrew Rindos (4), Kishor Trivedi (5)
  • 1: Xerox Research Center India
  • 2: Università degli Studi di Messina, Italy
  • 3: IBM T. J. Watson Research Center, USA
  • 4: IBM, USA
  • 5: Duke University, USA
*Contact email: flongo@unime.it

Abstract

We quantify the resiliency of large-scale systems upon changes encountered beyond the normal system behavior. General steps for resiliency quantification are shown, and resiliency metrics are defined to quantify the effects of changes. The proposed approach is illustrated through an Infrastructure-as-a-Service (IaaS) Cloud use case. Specifically, we assess the impact of changes in demand and available capacity on the Cloud resiliency using interacting state-space based sub-models, where interdependencies are resolved using fixed-point iteration. Since resiliency quantification involves understanding the transient behavior of the system, fixed-point variables evolve with time, leading to non-homogeneous Markov chains. In this paper, we present an algorithm for resiliency analysis when dealing with such non-homogeneous sub-models. A comparison is shown with our past research, where we quantified the resiliency of IaaS Cloud performance using a one-level monolithic model. Numerical results show that the approach proposed in this paper can scale to a real-sized Cloud without significantly compromising accuracy.
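The idea of fixed-point variables that evolve with time can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the algorithm or the sub-models from the paper: two hypothetical two-state chains (a capacity chain with states up/down and a demand chain with states busy/free) exchange one probability each, the exchanged values are made consistent by fixed-point iteration inside every time step, and the transient solution is advanced with a simple Euler step. All names, rates, and the coupling between the chains are assumptions chosen only for illustration.

```python
def euler_step(p, rate_out, rate_in, dt):
    """One Euler step for a two-state chain.

    p is the probability of state 1; the chain leaves state 1 at
    rate_out and enters it at rate_in.
    """
    return p + dt * (rate_in * (1.0 - p) - rate_out * p)


def transient_fixed_point(t_end, dt=0.01, arrival=2.0, service=3.0,
                          base_fail=0.1, repair=1.0,
                          tol=1e-10, max_iter=100):
    """Transient solution of two interacting toy sub-models.

    Hypothetical coupling (an assumption, not taken from the paper):
      * the failure rate of the capacity chain grows with P(busy);
      * the service rate of the demand chain scales with P(up).
    Because the exchanged probabilities change over time, each chain
    sees time-varying rates, i.e. it is non-homogeneous.
    """
    p_up, p_busy = 1.0, 0.0          # start: capacity up, no demand
    trajectory = [(0.0, p_up, p_busy)]
    steps = int(round(t_end / dt))

    for k in range(1, steps + 1):
        busy_guess = p_busy          # initial guess for the fixed point
        for _ in range(max_iter):
            # Capacity sub-model uses the current guess of P(busy).
            fail_rate = base_fail * (1.0 + busy_guess)
            up_next = euler_step(p_up, fail_rate, repair, dt)
            # Demand sub-model uses the capacity output just computed.
            busy_next = euler_step(p_busy, service * up_next, arrival, dt)
            if abs(busy_next - busy_guess) < tol:
                break                # exchanged values are consistent
            busy_guess = busy_next
        else:
            raise RuntimeError("fixed-point iteration did not converge")

        p_up, p_busy = up_next, busy_next
        trajectory.append((k * dt, p_up, p_busy))

    return trajectory


if __name__ == "__main__":
    for t, up, busy in transient_fixed_point(5.0)[::100]:
        print(f"t={t:4.1f}  P(up)={up:.4f}  P(busy)={busy:.4f}")
```

In this sketch a change in demand could be studied by rerunning the function with a different `arrival` value and comparing the two trajectories. The Euler step keeps the example short; a production analysis would more likely use a dedicated transient solver.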