Resource management for bursty streams on multi-tenancy cloud environments
Introduction
The number of applications that need to process data continuously over long periods of time has increased significantly in recent years. Often the raw data captured from the source are converted into complex events, which are subsequently further analysed. Such applications include weather forecasting and ocean observation [1], text analysis (especially with the growing requirement to analyse social media data), "Urgent Computing" [2], and, more recently, data analysis from electricity meters to support "Smart (Power) Grids" [3]. Emerging Internet of Things and Smart City scenarios confirm this trend: the increasing deployment of sensor network infrastructures generates large volumes of data that often must be processed in real time. Data streams in such applications can be large-scale, distributed, and generated continuously at a rate that cannot be estimated in advance. Scalability therefore remains a major requirement for such applications, to handle variable event loads efficiently [4].
Multi-tenancy Cloud environments enable such concurrent data streams (with data becoming available at unpredictable times) to be processed over a shared, distributed computing infrastructure. When multiple applications are executed over the same shared elastic infrastructure, each stream must be isolated from the others in order to either: (i) run all instances without violating their particular Quality of Service (QoS) constraints; or (ii) indicate that, given current resources, a particular stream instance cannot be accepted for execution. The QoS demand of each stream is captured in a Service Level Agreement (SLA), agreed a priori between the stream owner/generator and the service provider hosting the analysis capability. Such an SLA identifies the cost that a user must pay to achieve the required QoS and a penalty that must be paid to the user if the QoS cannot be met [5].
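As a minimal illustration of such a contract (the class name, price, and penalty values below are hypothetical, not taken from the paper), an SLA of this kind can be modelled as a record pairing a price for meeting the QoS target with a penalty owed to the user when it is violated:

```python
from dataclasses import dataclass

@dataclass
class SLA:
    """Pre-agreed contract between a stream owner and the provider."""
    throughput_target: float  # events/s the provider must sustain
    price: float              # paid by the user when the target is met
    penalty: float            # paid back to the user on a violation

    def revenue(self, achieved_throughput: float) -> float:
        """Provider's revenue for one accounting interval."""
        if achieved_throughput >= self.throughput_target:
            return self.price
        return -self.penalty

# Hypothetical "Gold" contract: high price, but also a high penalty.
gold = SLA(throughput_target=1000.0, price=10.0, penalty=25.0)
print(gold.revenue(1200.0))  # target met -> 10.0
print(gold.revenue(800.0))   # violation  -> -25.0
```

The asymmetry between `price` and `penalty` is what makes admission and allocation decisions non-trivial: a violated high-value SLA can cost more than the stream would ever have earned.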
If profit maximization is the Cloud provider's main objective, the provider must decide which streams to accept for storage and analysis, and how many resources to allocate to each stream, in order to improve its overall profit. This task is highly challenging with aggregated, unpredictable and bursty data flows, which usually make both predictive and simple reactive approaches unsuitable. Even dynamic provisioning of resources may fail to preserve the provider's profit, since the delay incurred can be too high: it may take several seconds to add new resources (e.g. to instantiate new Virtual Machines (VMs)), and a scaling-up action might generate substantial penalties and overheads.
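A naive admission policy consistent with this objective (a sketch only; the function, stream names and numbers below are illustrative assumptions, not the paper's model) greedily accepts streams in order of expected profit per resource unit until capacity is exhausted:

```python
def admit_streams(streams, capacity):
    """Greedy admission sketch.

    `streams` is a list of (name, resources_needed, expected_profit)
    tuples; `capacity` is the total number of resource units available.
    """
    # Rank streams by expected profit per resource unit, best first.
    ranked = sorted(streams, key=lambda s: s[2] / s[1], reverse=True)
    accepted, used = [], 0
    for name, need, profit in ranked:
        if used + need <= capacity and profit > 0:
            accepted.append(name)
            used += need
    return accepted

# Hypothetical workload: (name, resource units needed, expected profit).
streams = [("gold-1", 4, 40.0), ("silver-1", 2, 12.0), ("bronze-1", 3, 3.0)]
print(admit_streams(streams, capacity=6))  # ['gold-1', 'silver-1']
```

A real controller must go further than this static ranking, precisely because burstiness makes `expected_profit` uncertain and because reallocating resources at runtime carries its own delays and costs.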
Our main contributions are data admission and control policies that regulate data access and manage the impact of data bursts, together with a resource redistribution policy that seeks to minimize the cost of QoS penalty violations and thereby maximize overall profit. The rationale behind the latter policy is that current mechanisms for scaling resources in Cloud infrastructures incur severe delays, which may provoke large financial penalties. Overall, our contributions can be summarized as follows: (i) an improved profit model that takes into account both revenue and penalties; (ii) a set of dynamic control actions to manage resources so as to maximize the provider's profit; (iii) a unified token-based resource management model for realizing the profit-oriented actions, which aims to optimize the utilization of unused resources and enables dynamic, consistent re-allocation of resources; (iv) the specification of all the control logic as a Reference-net model; (v) extensive simulations of various scenarios demonstrating the effectiveness of the proposed profit-oriented control mechanism; and (vi) an OpenNebula-based deployment showing how the Reference-net model can be turned into an executable model in a straightforward manner.
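For intuition, the token-based idea can be sketched as a classic per-stream token bucket whose overflow is pooled and handed to other buckets of the same class (all class names and parameter values here are illustrative assumptions, not the paper's configuration):

```python
class TokenBucket:
    """Per-stream token bucket: tokens accrue at `rate` per tick,
    capped at `capacity`; each admitted data unit consumes one token."""
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens = 0.0

    def tick(self):
        """Accrue one interval's tokens; return the overflow
        instead of silently discarding it."""
        self.tokens += self.rate
        excess = max(0.0, self.tokens - self.capacity)
        self.tokens -= excess
        return excess

def redistribute(pool, buckets):
    """Grant pooled excess tokens to buckets with headroom,
    a sketch of the 'additional bucket' redistribution idea."""
    for bucket in buckets:
        grant = min(bucket.capacity - bucket.tokens, pool)
        bucket.tokens += grant
        pool -= grant
        if pool <= 0:
            break
    return pool  # excess that no bucket could absorb

# Two hypothetical same-class buckets: one nearly full, one drained.
a = TokenBucket(rate=5.0, capacity=10.0)
b = TokenBucket(rate=5.0, capacity=10.0)
a.tokens, b.tokens = 9.0, 2.0
pool = a.tick() + b.tick()          # a overflows by 4.0, b by 0.0
leftover = redistribute(pool, [b])  # b absorbs 3.0, 1.0 remains
print(b.tokens, leftover)
```

The design intent this sketch captures is that tokens an under-loaded stream cannot use are not wasted but consistently re-allocated within the class, which is cheaper and faster than instantiating new VMs.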
Our previous contributions on enforcing QoS in shared Cloud infrastructures were described in [6], [7], [8]. In [9], [10], we proposed a profit-based resource management model for streaming applications over shared Clouds. In [11] we extended this with an improved revenue generation model and identified specific actions to support resource management, in particular: (i) the re-distribution of unused resources amongst data streams; and (ii) the dynamic re-allocation of resources to streams likely to generate greater revenue for the Cloud provider. This paper extends [10], [11] by combining our previous profit-based resource management model with an OpenNebula-based Cloud deployment. We provide a model in terms of Reference nets, a particular type of Petri net. A key characteristic of Reference nets is that they can be interpreted and support Java actions in their transitions, so the models proposed here become directly executable.
The remainder of this paper is structured as follows. Section 2 presents the revenue-based model for in-transit analysis and the profit-oriented actions to manage resources while maximizing the provider's profit. Section 3 describes our system architecture based on the token bucket model, the rule-based SLA management of QoS, and the unified token-based resource management model that realizes the profit-oriented actions by optimizing the utilization of unused resources and allowing dynamic and consistent re-allocation of resources. Section 4 describes the Reference-net model of the control logic. Section 5 presents our evaluation scenarios and simulation results. Section 6 presents our deployment and experiments on an OpenNebula-based Cloud infrastructure. Section 7 discusses the most closely related work. Finally, conclusions and future work are given in Section 8.
Section snippets
Profit-based model
We consider a provider-centric view of the costs incurred to provide data stream processing services over a number of available computational resources. If we assume the objective of the provider is to maximize revenue, then it must decide: (i) which user streams to accept for storage and analysis; (ii) how many resources (including storage space and computational capacity) to allocate to each stream in order to improve overall profit (generally over a time horizon); and (iii) what actions
System architecture for dynamic management of resources
Our system architecture can process a number of data streams simultaneously, with the main objective of maintaining a negotiated SLA (throughput) while minimizing the number of computational resources involved. The underlying resource management policy is based on a business model: each SLA violation leads to an associated penalty, whereas scaling up the computational resources involved has an associated cost. Therefore, the system controller's main policy is to trigger the action that
System architecture Petri Net specification
The specification of the architectural components has been modeled using a Petri Net that uses Java as its inscription language. We use Petri Nets as an executable architectural description language that provides precise and concise specifications of complex concurrent behaviors. Java complements Petri Nets by supporting the modeling of complex data structures and the integration of Java libraries, such as the rule engine, as well as integration with Cloud infrastructures. In this
Evaluation scenarios by simulation
We propose three evaluation scenarios to show the behavior of our controller: (i) the addition/removal of resources to the queue that provisions "Gold" streams, taking resources from "Bronze" streams; (ii) the selective violation of "Silver" data stream SLAs to avoid violations of Gold data stream SLAs; and (iii) a final scenario showing the redistribution of unused resources by an additional bucket that collects excess tokens and redistributes them over the same class as proposed in
OpenNebula-based implementation
In previous sections, we validated our models by simulation. In this section, we test the feasibility of our proposal on a real Cloud infrastructure. For this purpose, we exploit an OpenNebula data centre for a real implementation of our model. The fact that our Reference net models are executable, as they can be interpreted by Renew, allows us to interface directly with OpenNebula from the nets: create and switch on and off Virtual Machines (VMs), transmit data to the data
Related work
Resource provisioning, resource allocation, resource mapping, and resource adaptation in Cloud-based infrastructures have received significant attention in recent years [29]. Three main approaches to scaling resources have been identified. First, reactive mechanisms mainly use monitored values and apply elasticity rules or threshold-based rules pre-defined by service providers [30], [16], [31]. Second, predictive mechanisms try to learn from previous data history and resource usage to
Conclusion and future work
There is an emerging interest in processing data streams over shared Cloud infrastructures, with data elements being processed at distributed nodes in transit from source to sink. We consider the execution of simultaneous data streams over such an infrastructure, with each stream having particular QoS objectives (throughput or latency, for instance) expressed within an SLA. We established three different classes of customers submitting data streams (Gold, Silver and Bronze), with each class
Rafael Tolosana-Calasanz is currently an Associate Professor at the Computer Science and Systems Engineering Department of the University of Zaragoza, Spain. He holds a Ph.D. from the University of Zaragoza and his research interests lie in the intersection of Distributed and Concurrent Systems, Scientific Workflows and Petri nets.
References (49)
- et al., Enforcing QoS in scientific workflow systems enacted over cloud infrastructures, J. Comput. System Sci., 2012.
- et al., Cost model based service placement in federated hybrid clouds, Future Gener. Comput. Syst., 2014.
- et al., Supporting CPU-based guarantees in cloud SLAs via resource-level QoS metrics, Future Gener. Comput. Syst., 2012.
- et al., Resource management for infrastructure as a service (IaaS) in cloud computing: a survey, J. Netw. Comput. Appl., 2014.
- et al., Adaptive resource provisioning for read intensive multi-tier applications in the Cloud, Future Gener. Comput. Syst., 2011.
- et al., Cyberinfrastructure for coastal hazard prediction, CTWatch Q., 2008.
- et al., LEAD cyberinfrastructure to track real-time storms using SPRUCE urgent computing, CTWatch Q., 2008.
- et al., Adaptive rate stream processing for smart grid applications on clouds.
- et al., A view of cloud computing, Commun. ACM, 2010.
- I. Petri, O. Rana, G.C. Silaghi, Y. Rezgui, Risk assessment in service provider communities, in: The 8th International...
- End-to-end QoS on shared clouds for highly dynamic, large-scale sensing data streams.
- Revenue-based resource management on shared clouds for heterogeneous bursty data streams.
- Revenue creation for rate adaptive stream management in multi-tenancy environments.
- Event Processing in Action.
- Rule-based SLA management for revenue maximisation in cloud computing markets.
- Exact admission control for networks with a bounded delay service, IEEE/ACM Trans. Netw.
- A survey of envelope processes and their applications in quality of service provisioning, IEEE Commun. Surv. Tutor.
- A performance study of event processing systems.
- Petri nets: properties, analysis and applications.
- Petri nets as token objects: an introduction to elementary object nets.
Cited by (22)
- Exploiting data centres energy flexibility in smart cities: Business scenarios (Information Sciences, 2019). Citation excerpt: "Authors of [3] propose a cost model for federated hybrid clouds through a cost minimization algorithm for service placement in clouds to minimize the spending for computational services. A profit-based resource management model is presented in [46] based on a policy for resource redistribution that reduces the cost of QoS (Quality of Service) penalty violation, maximizing the overall provider profit in cloud environments. The model uses Petri nets for modeling system components and is demonstrated using an OpenNebula-based Cloud infrastructure [51]."
- Economics of Computing Services: A literature survey about technologies for an economy of fungible cloud services (Future Generation Computer Systems, 2018).
- Model-driven development of data intensive applications over cloud resources (Future Generation Computer Systems, 2018). Citation excerpt: "Additionally, elasticity and resilience are also intrinsically linked with resource management, scheduling and autonomic principles with direct effect on performance and cost [58]. Although this paper focuses on the first four properties, the utility of formal models for the modelling of components supporting the two last properties is shown in previous works: in [59,60,32,61,62] are presented specifications of strategies on cloud for resource management following autonomic principles at the application level for streaming and scientific workflows. It is also important to highlight the work of Brogi et al. [63] that propose to extend TOSCA with the use of Petri nets for modelling management operations of complex applications over heterogeneous clouds."
- Cloudy in guifi.net: Establishing and sustaining a community cloud as open commons (Future Generation Computer Systems, 2018). Citation excerpt: "Some examples are image processing [42] or sensor data processing [43]. The quality of service needs of these streams will require solutions to cope, for instance, with busy streams or real-time constraints [44,45]. Networking and computing infrastructures and services are critical resource systems for social inclusion and participation."
- Distributed data stream processing and edge computing: A survey on resource elasticity and future directions (Journal of Network and Computer Applications, 2018). Citation excerpt: "Although resource elasticity for stream processing applications has been investigated in previous work, several challenges are not yet fully addressed (Sattler and Beier, 2013). As highlighted by Tolosana-Calasanz et al. (2016), mechanisms for scaling resources in cloud infrastructure can still incur severe delays. For stream processing engines that organise applications as operator graphs, an elastic operation that adds more nodes at runtime may require re-routing the data and migrating stream processing operators."
- Efficient Resource Management System Based on 4Vs of Big Data Streams (Big Data Research, 2017). Citation excerpt: "In 2016, Rahman and Graham [32] developed a priority based method for multimedia data processing. But the methods proposed in [31] and [32] are static hybrid algorithms. The authors in [33] argued that it is necessary to predict the volume of streaming big data for efficient node allocation."
José Ángel Bañares was born in Zaragoza, Spain in 1966. He received the M.S. degree in Industrial-Electrical engineering and the Ph.D. degree in Computer Science from the University of Zaragoza, in 1991 and 1996, respectively. He joined the faculty of the University of Zaragoza as an Assistant Professor in 1994, and since 1999 he has been an Associate Professor. His research interests include Petri nets, Artificial Intelligence and Distributed Computing.
Congduc Pham obtained his Ph.D. in Computer Science in July 1997 at the LIP6 Laboratory (Laboratoire d’Informatique de Paris 6), University Pierre and Marie Curie. He also received his Habilitation in 2003 from University Claude Bernard Lyon 1. He received a Diplôme d’Etudes Approfondies (DEA) in computer systems from the University Pierre et Marie Curie in 1993. He also obtained the Magistère d’Informatique Appliquée de l’Ile de France in 1993. He spent one year at the University of California, Los Angeles as a Post-Doctoral Fellow with Professor R. L. Bagrodia. From 1998 to 2005, he was an Associate Professor at the University of Lyon, member of the INRIA RESO project in the LIP laboratory at the ENS Lyon. He is now at the University of Pau and the LIUPPA laboratory. From September 2006 to July 2009, he was Director of the LIUPPA laboratory.
Omer F. Rana is a Professor of Performance Engineering and member of the Distributed Collaborative Computing Group in the School of Computer Science at Cardiff University. He holds a Ph.D. in “Neural Computing and Parallel Architectures” from Imperial College, London University. He has worked as a Software Developer with Marshall BioTechnology Limited (London) and as an Adviser to the US-based “Grid Technology Partners”. He has been Deputy Director of the Welsh eScience Centre. He has participated as a theme leader within a number of European and UK-funded networks. He co-led the “Jini” working group at the Open Grid Forum, and has contributed to the Service Level Agreements efforts (and integration of these with workflow techniques) within the GRAAP working group.
This work was partially supported by the OMNIDATA project funded by the Aquitaine–Aragon collaboration research program between Aquitaine region (France) and Aragon region (Spain); and by the Spanish Ministry of Economy under the program “Programa de I+D+i Estatal de Investigación, Desarrollo e innovación Orientada a los Retos de la Sociedad”, project identifier TIN2013-40809-R.