Joint optimization of service request routing and instance placement in the microservice system
Introduction
As a new architectural style for provisioning services, the microservice architecture (also called μService) is currently attracting significant attention. Traditionally, all the modules of an application are packaged and deployed as a monolith. However, this approach suffers from several issues regarding software updates, reliability, and scalability (Fazio et al., 2016). For instance, to update a small part of an application, the entire application needs to be redeployed. In contrast, μService refactors an application into a set of small, interconnected microservices. Each microservice is self-contained and can be developed, updated, and deployed independently. The business logic is implemented by a series of microservices that form a microservice chain through remote API calls (e.g., REST or message queues). μService thus brings flexibility, reliability, and speed to software updates and service delivery. With the advent of container technologies such as Docker, the new architecture powers modern cloud applications.
However, the enhanced flexibility comes with potentially increased hardware resource usage (e.g., CPU and memory) and request processing time due to service decomposition (NGINX, ). When building a μService system, resources must be allocated to a larger number of microservice instances and also to the more complex system management tools (e.g., load balancers and failover software) that such an elaborate system requires. Careless resource allocation can increase energy consumption. Moreover, a service request is processed across multiple instances through inter-process communication rather than through the language-level function calls used in monolithic applications. Each instance may have its own request processing logic and a processing capability that depends on its allocated computing resources. Hence, careless resource allocation can also increase the end-to-end request processing time of microservice chains (hereinafter referred to as service time), which is the major QoS metric of concern to service providers and users.
Reducing energy consumption while improving QoS involves a trade-off in resource allocation: allocating more computing resources to microservice instances reduces service time, but at the cost of higher energy consumption, and vice versa. Even without considering energy consumption, chains contend with one another for resources to improve their own performance (Niu et al., 2018), which can result in unbalanced QoS across chains. Thus, careful resource allocation that avoids over- and under-provisioning is needed, but it is not trivial for μServices since they are deployed at a significant scale (e.g., Uber's application is composed of over 1000 instances) (Panda et al., 2017). To this end, a critical problem needs to be addressed: how can resources be efficiently allocated and balanced, not only to reduce energy consumption but also to guarantee high QoS, when creating and placing microservice instances and routing service requests?
This resource optimization problem for μServices is similar to the one in VM-based clouds, which has attracted many solutions (Jennings and Stadler, 2015; Zhan et al., 2015). Unfortunately, these solutions are limited when applied to μServices. The major difference between μServices and VM-based clouds is that μServices process users' requests through chain-oriented service provisioning. Thus, resource optimization for μServices should be governed at the chain level by considering heterogeneous requests and inter-chain resource contention (Niu et al., 2018). Moreover, since vertical scaling for VMs (changing the resources assigned to a VM on the fly) is costly or not supported (Vaquero et al., 2011), few of these solutions consider a detailed workload model inside a VM in order to scale vertically; they usually only optimize the number of running VMs (horizontal scaling) according to the fluctuating workload. Existing resource allocation solutions for μServices follow the same horizontal optimization (e.g., Guerrero et al., 2018; Niu et al., 2018). However, containers are a more lightweight virtualization technology and support dynamic resource allocation inside containers at low operational cost (Kulkarni et al., 2017). Therefore, for containerized μServices, resource allocation should be redesigned at a fine-grained level (e.g., CPU cycles) rather than at the instance level, which has not been well studied.
In this paper, we study resource allocation optimization for provisioning web services with μServices. We focus on the containerized μService system, in which a microservice instance is a container running the microservice. To achieve fine-grained resource allocation, and unlike existing works, we model resource usage and system performance at the CPU cycle level in each microservice instance. We formulate a bi-objective optimization problem in which resource allocation is jointly optimized to reduce both energy consumption and service time for provisioning services. To address this problem, we decompose it into three subproblems to reduce the decision search space, and then we propose Lego, a three-stage scheme that finds optimal (or near-optimal) trade-off decisions for load balancing, energy saving, and QoS assurance. Through extensive simulation experiments, we validate that Lego significantly outperforms several state-of-the-art approaches in that it produces higher-quality trade-off decisions and achieves better-balanced service request routing and microservice instance placement. Our main contributions are summarized as follows:
- We formulate the resource usage and service time patterns of microservice instances and chains using queueing theory, and we model energy consumption and QoS assurance as a bi-criteria resource allocation optimization problem.
- Lego starts with our resource allocation algorithm based on multiobjective particle swarm optimization, which can produce excellent trade-off decisions within a few iterations to create microservice instances.
- We then propose an efficient heuristic request routing algorithm to route requests to these created instances in a balanced manner.
- Finally, Lego deploys these instances onto servers with our balance-aware instance placement algorithm, which achieves high balancing performance with low computational complexity.
- Our extensive simulation results show that, compared with several widely used optimization algorithms and at the same service time, Lego significantly reduces overall energy consumption and routes requests and places instances in a more balanced manner.
The rest of this paper is structured as follows. In Section 2, we discuss related work. The system model and our optimization problem are presented in Sections 3 and 4, respectively. Our three-stage scheme is described in Section 5. We present experiments and evaluations in Section 6 and conclude the paper in Section 8.
Section snippets
Motivating scenario
Consider a service provider that offers a cloud infrastructure with a set of microservices (MSs) as the μService system, in which each MS is encapsulated in a container image. The provider provisions services/applications in terms of microservice chains (MSCs), each of which is composed of a set of ordered MS instances that interact through REST requests. An API gateway is provided with a set of service APIs to receive service requests from front-end users and to route them to
System model
We now introduce our μService system model with the main terminology used to represent physical infrastructure, microservice, microservice chain, and energy consumption. The basic notations are explained in Table 1.
Problem formulation
Given the system model, we now define the Energy- and QoS-aware service request Routing and instance Placement (EQRP) problem. Specifically, given the infrastructure resource I and a set of service requests N, we aim to find the minimum resource usage for deploying MS instances to reduce the power rate, as well as the minimum service time of each MSC. This is a bi-criteria optimization problem in which the two objectives conflict with each other: allocating more resources can reduce service time,
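The conflict between the two objectives can be illustrated with a small numerical sketch. It is not the paper's model: it assumes, purely for illustration, that each MS instance behaves as an M/M/1 queue whose service rate grows linearly with its allocated CPU share, and that the power rate is linear in the total allocated CPU. The names `chain_service_time`, `power_rate`, `evaluate`, and all constants are hypothetical.

```python
def chain_service_time(arrival_rate, service_rates):
    """Mean end-to-end time of a chain of M/M/1 stages (assumed model).

    Each stage contributes the M/M/1 mean sojourn time 1 / (mu - lambda).
    """
    total = 0.0
    for mu in service_rates:
        if mu <= arrival_rate:
            return float("inf")  # the stage is overloaded, so the queue is unstable
        total += 1.0 / (mu - arrival_rate)
    return total


def power_rate(cpu_shares, idle_watts=100.0, watts_per_core=50.0):
    """Assumed linear power model: idle power plus a per-core term."""
    return idle_watts + watts_per_core * sum(cpu_shares)


def evaluate(arrival_rate, cpu_shares, rate_per_core=100.0):
    """Return (power rate, service time) for one allocation of CPU shares."""
    service_rates = [rate_per_core * share for share in cpu_shares]
    return power_rate(cpu_shares), chain_service_time(arrival_rate, service_rates)


# Two allocations for a two-instance chain receiving 50 requests per second.
small = evaluate(50.0, [1.0, 1.0])
large = evaluate(50.0, [2.0, 2.0])
print("small allocation (watts, seconds):", small)
print("large allocation (watts, seconds):", large)
```

Under these assumptions, the larger allocation shortens the service time but raises the power rate, so neither allocation is better on both objectives.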
Solution scheme
In this section, we discuss our solution for the EQRP problem. Since it is a bi-criteria decision problem, there is no single best decision but rather a set of trade-off decisions. The optimality of these decisions is commonly quantified by the Pareto-dominance relation, in which a decision vector is Pareto-dominated by another decision vector if and only if (i) all the objective results achieved by the latter are better than or equivalent to those achieved by the former, and (ii)
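The Pareto-dominance relation can be sketched in a few lines. The example below uses the standard textbook definition for minimization (no worse on every objective and strictly better on at least one); it is not taken from Lego's implementation, and the function names and candidate values are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (power rate in watts, service time in seconds) pairs.
candidates = [(200.0, 0.040), (300.0, 0.013), (300.0, 0.050), (250.0, 0.020)]
front = pareto_front(candidates)
print(front)
```

Here `(300.0, 0.050)` is dropped because `(200.0, 0.040)` is better on both objectives, while the remaining three decisions are mutually non-dominated trade-offs.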
Experiments and evaluations
Experimental evaluations are carried out in this section to validate the performance of Lego. By simulating a μService system with a real workload, we have considered the impact of different factors and conducted various quantitative experiments for our three algorithms and the entire three-stage scheme compared with other methods.
Discussion and future work
Lego achieves a significant performance. However, it also has its limitations. Alleviating them will be important future work for application in production environments.
Poisson process. We use a Poisson process to model system behaviors in the μService system. In practice, request arrivals generally follow a heavy-tailed (Pareto) distribution (Barabasi, 2005). However, the first-come-first-serve message processing in a software environment leads to uniform Poisson-like distributions. The
Conclusion
In this paper, we studied the service request routing and microservice instance deployment problem in the μService environment. We leveraged the characteristic of container-based μServices to model the μService system and formulate the resource allocation problem to be aware of energy consumption and the QoS assurance. We then proposed a three-stage approach (Lego) to address this problem. In particular, the approach first searches and optimizes trade-off decisions of resource allocation for
Acknowledgment
This work is supported by the Provincial Science & Technology Pillar Program of Hubei under Grants 2017AAA027, 2017AAA042 and 2017AHB048.
Yinbo Yu received the B.E. degree in Electronic Information Engineering from the School of Electronic Information, Wuhan University, Wuhan, China, in 2014, where he is currently pursuing the Ph.D. degree. He is also a visiting Ph.D. student with EECS, Northwestern University, Evanston, IL, USA. His research interests include SDN, NFV, cellular networks, service management, and network security.
References (46)
- et al. Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing. Future Gener. Comput. Syst. (2012)
- et al. Load balancing across microservices
- et al. Resource allocation algorithms for virtualized service hosting platforms. J. Parallel Distrib. Comput. (2010)
- Online holiday shopping trends and traffic report for Europe and North America
- et al. A measurement-based characterization of the energy consumption in data center servers. IEEE J. Sel. Area. Commun. (2015)
- The origin of bursts and heavy tails in human dynamics. Nature (2005)
- et al. Estimating the distribution of a sum of independent lognormal random variables. IEEE Trans. Commun. (1995)
- The output of a queuing system. Oper. Res. (1956)
- et al. Self-adaptive trade-off decision making for autoscaling cloud-based services. IEEE Trans. Serv. Comput. (2017)
- et al. Two issues in setting call centre staffing levels. Ann. Oper. Res. (2001)
- Elastic virtual machine for fine-grained cloud resource provisioning
- A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput.
- Multi-objective particle swarm optimizers: an experimental comparison
- Energy conservation policies for web servers
- Power provisioning for a warehouse-sized computer
- Open issues in scheduling microservices in the cloud. IEEE Cloud Comput.
- The convexity of the mean queue size of the M/M/c queue with respect to the traffic intensity. J. Appl. Probab.
- Resource optimization of container orchestration: a case study in multi-cloud microservices-based applications. J. Supercomput.
- Resource management in clouds: survey and research challenges. J. Netw. Syst. Manag.
- The Differencing Method of Set Partitioning
- Performance analysis of cloud computing centers using m/g/m/m r queuing systems. IEEE Trans. Parallel Distrib. Syst.
- NFVnice: dynamic backpressure and scheduling for NFV service chains
Jianfeng Yang received his Bachelor, Master and Ph.D. degrees in Information and Communication Engineering from Wuhan University, China, in 1998, 2002 and 2009, respectively. He is currently an associate professor at Wuhan University. He worked as a visiting scholar at Intel in 2012 and at Northwestern University from 2015 to 2016. His research interests are in security and measurement for networking, edge computing, and real-time wireless communication.
Chengcheng Guo received his Ph.D. degree from the School of Electronic Information, Wuhan University, China. He received his Bachelor and Master degrees from the Computer School of Wuhan University, China. He is currently a professor and a Ph.D. supervisor in the School of Electronic Information, Wuhan University, China. His research interests are internet and communication technology, wireless mesh networks, industrial control networks, and real-time and reliable communications.
Hong Zheng received his Ph.D. degree in Photogrammetry and Remote Sensing from Wuhan University, China, in 2000. He worked as a research fellow at Deakin University from 2001 to 2014. He is currently a professor at Wuhan University. His research interests are in machine vision and artificial intelligence.