Joint optimization of service request routing and instance placement in the microservice system

https://doi.org/10.1016/j.jnca.2019.102441

Abstract

Microservice architecture is a promising architectural style. It decomposes monolithic software into a set of loosely coupled containerized microservices and associates them into multiple microservice chains to serve service requests. The new architecture creates flexibility for service provisioning but can also increase energy consumption and degrade service performance, so efficient resource allocation is critical. Unfortunately, existing solutions are designed at a coarse level for virtual machine (VM)-based clouds and are not optimized for such chain-oriented service provisioning. In this paper, we study the resource allocation optimization problem for service request routing and microservice instance placement, so as to jointly reduce both resource usage and chains’ end-to-end response time, thereby saving energy and guaranteeing Quality of Service (QoS). We design detailed workload models for microservices and chains and formulate the optimization problem as a bi-criteria optimization problem. To address it, we propose a three-stage scheme that searches for and optimizes the trade-off decisions, routes service requests to instances, and deploys instances to servers in a balanced manner. Through numerical evaluations, we show that, while assuring the same QoS, our scheme reduces energy consumption and balances load significantly better and faster than benchmark algorithms.

Introduction

As a new architectural style for provisioning services, microservice architecture (also called μService) is currently attracting significant attention. Traditionally, all the modules of an application are packaged and deployed as a monolith. However, this approach suffers from several issues regarding software updates, reliability, and scalability (Fazio et al., 2016). For instance, to update a small part of an application, the entire application needs to be redeployed. In contrast, μService refactors an application into a set of small, interconnected microservices. Each microservice is self-contained and can be developed, updated and deployed independently. The business logic is implemented by a series of microservices forming a microservice chain through remote API calls (e.g., REST or message queue). μService improves the flexibility, reliability, and speed of software updating and service delivery. With the advent of container technologies such as Docker, the new architecture powers modern cloud applications.

However, along with the enhanced flexibility come potentially increased hardware resource usage (e.g., CPU and memory) and request processing time due to service decomposition (NGINX, ). When building a μService system, resources must be allocated to a larger number of microservice instances and to more complex system management tools (e.g., load balancers and failover software) than in a monolithic system. Careless resource allocation can therefore increase energy consumption. Moreover, a service request is processed across multiple instances through inter-process communication rather than through language-level function calls as in monolithic applications. Each instance may have its own request processing logic and a processing capability that depends on its allocated computing resources. Hence, careless resource allocation can also increase the end-to-end request processing time of microservice chains (hereinafter referred to as service time), which is the main QoS metric of concern to service providers and users.

To reduce energy consumption and improve QoS, resource allocation must resolve a trade-off: allocating more computing resources to microservice instances reduces service time, but at the cost of higher energy consumption, and vice versa. Even without considering energy consumption, chains contend with one another for resources to improve their own performance (Niu et al., 2018), which can result in unbalanced QoS across chains. Thus, careful resource allocation that avoids over- and under-provisioning is much needed, but it is not trivial for μServices since they are deployed at a significant scale (e.g., Uber's application is composed of over 1000 instances) (Panda et al., 2017). To this end, a critical problem needs to be addressed: how can resources be efficiently allocated and balanced, not only to reduce energy consumption but also to guarantee high QoS, when creating and placing microservice instances and routing service requests?

This resource optimization problem for μServices is similar to the one in VM-based clouds, which has attracted many solutions (Jennings and Stadler, 2015; Zhan et al., 2015). Unfortunately, these solutions tend to be limited when applied to μServices. The major difference between μServices and VM-based clouds is that μServices process users’ requests through chain-oriented service provisioning. Thus, resource optimization for μServices should be governed at the chain level by considering heterogeneous requests and inter-chain resource contention (Niu et al., 2018). Moreover, since vertical scaling of VMs (changing the resources assigned to a VM on the fly) is costly or not supported (Vaquero et al., 2011), few of these solutions consider a detailed workload model inside a VM in order to scale vertically; they usually only optimize the number of running VMs (horizontal scaling) according to the fluctuating workload. Existing resource allocation solutions for μServices follow the same horizontal optimization (e.g., (Guerrero et al., 2018; Niu et al., 2018)). However, containers are a more lightweight virtualization technology and support dynamic resource allocation inside containers at low operational cost (Kulkarni et al., 2017). Therefore, for containerized μServices, resource allocation should be redesigned at a fine-grained level (e.g., CPU cycles) rather than at the instance level, which has not been well studied.

In this paper, we study resource allocation optimization for provisioning web services with μServices. We focus on the containerized μService system, in which a microservice instance is a container running the microservice. To achieve fine-grained resource allocation, and unlike existing works, we model resource usage and system performance at the CPU cycle level in each microservice instance. The optimization problem is formulated as a bi-objective optimization problem, with which we jointly optimize resource allocation to reduce both energy consumption and service time for provisioning services. To address this problem, we divide it into three subproblems to reduce the decision search space, and then we propose Lego, a three-stage scheme to find the optimal (or near-optimal) trade-off decisions for Load balancing, energy saving and QoS assurance. Through extensive simulation experiments, we validate that Lego significantly outperforms several state-of-the-art approaches, in that it produces higher-quality trade-off decisions and achieves better-balanced service request routing and microservice instance placement. Our main contributions are summarized as follows:

  • We formulate the resource usage and service time patterns of microservice instances and chains by queueing theory and model energy consumption and QoS assurance as a bi-criteria resource allocation optimization problem.

  • Lego starts with our resource allocation algorithm based on multiobjective particle swarm optimization, which can produce excellent trade-off decisions within a few iterations to create microservice instances.

  • We then propose an efficient heuristic request routing algorithm to route requests into these created instances in a balanced manner.

  • Finally, Lego deploys these instances onto servers using our balance-aware instance placement algorithm, which achieves high balancing performance with low computational complexity.

  • Our extensive simulation results show that, compared to several widely used optimization algorithms and at the same service time, Lego performs significantly better at reducing overall energy consumption and at routing requests and placing instances in a balanced manner.
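The trade-off between resource usage and service time that these contributions target can be illustrated with a minimal sketch. It assumes, purely for illustration, that every instance in a chain behaves as an M/M/1 queue receiving the same Poisson arrival rate, and that the total allocated service rate stands in for energy consumption. The paper's own workload and energy models may differ, and all function names and numbers below are hypothetical.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean time in system (waiting + service) of an M/M/1 queue."""
    if arrival_rate >= service_rate:
        # The queue grows without bound when capacity does not exceed demand.
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def chain_service_time(arrival_rate, service_rates):
    """End-to-end time of a chain: sum of the per-instance mean response times.

    Assumption: every instance in the chain sees the same arrival rate.
    """
    return sum(mm1_response_time(arrival_rate, mu) for mu in service_rates)


def chain_capacity(service_rates):
    """Total allocated service rate, used here only as a proxy for energy."""
    return sum(service_rates)


# Two hypothetical allocations (requests/second) for a three-instance chain.
ARRIVAL_RATE = 10.0
lean = [12.0, 15.0, 20.0]
generous = [20.0, 25.0, 40.0]

for name, rates in (("lean", lean), ("generous", generous)):
    print(name, chain_capacity(rates), round(chain_service_time(ARRIVAL_RATE, rates), 3))
```

Under these assumptions the generous allocation shortens the chain's service time but consumes more capacity, which is the conflict that a bi-criteria formulation has to balance.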

The rest of this paper is structured as follows. In Section 2, we discuss related work. The system model and our optimization problem are presented in Sections 3 and 4, respectively. Our three-stage scheme is described in Section 5. We present experiments and evaluations in Section 6 and conclude the paper in Section 8.

Motivating scenario

Consider a service provider that offers a cloud infrastructure with a set of microservices (MSs) as the μService system, in which each MS is encapsulated in a container image. The provider provisions services/applications in terms of microservice chains (MSCs), each of which is composed of a set of ordered MS instances that interact through REST requests. An API gateway is provided with a set of service APIs to receive service requests from front-end users and to route them to

System model

We now introduce our μService system model with the main terminology used to represent physical infrastructure, microservice, microservice chain, and energy consumption. The basic notations are explained in Table 1.

Problem formulation

Given the system model, we now define the Energy- and QoS-aware service request Routing and instance Placement (EQRP) problem. Specifically, given the infrastructure resource I and a set of service requests N, we aim to find the minimum resource usage for deploying MS instances to reduce power rate, as well as the minimum service time of each MSC. This is a bi-criteria optimization problem in which the two objectives conflict with each other: allocating more resources can reduce service time,

Solution scheme

In this section, we discuss our solution for the EQRP problem. Since it is a bi-criteria decision problem, there is no single decision but a set of trade-off decisions for the problem. These decisions (i.e., optimality) are commonly quantified by the Pareto-dominance relation, in which a decision vector d1 is Pareto-dominated by another decision vector d2, if and only if, (i) all the objective results achieved by d2 are better than or equivalent to those achieved by d1, and (ii) d2
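The Pareto-dominance relation can be sketched in a few lines. The second condition is cut off in the text above, so this sketch assumes the standard definition: the dominating decision is no worse in every objective and strictly better in at least one. Both objectives are assumed to be minimized, and the function names and the sample (resource usage, service time) pairs are hypothetical.

```python
def dominates(a, b):
    """Return True if objective vector `a` Pareto-dominates `b`.

    Assumption: every objective is minimized. `a` dominates `b` when it is
    no worse in all objectives and strictly better in at least one.
    """
    if len(a) != len(b):
        raise ValueError("objective vectors must have the same length")
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates (input order kept)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (resource usage, service time) outcomes of five decisions.
candidates = [(47, 0.8), (85, 0.2), (90, 0.8), (60, 0.5), (85, 0.3)]
front = pareto_front(candidates)
print(front)
```

In this toy data, (90, 0.8) is dominated by (47, 0.8) and (85, 0.3) by (85, 0.2); the remaining three points are mutually non-dominated and form the set of trade-off decisions.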

Experiments and evaluations

Experimental evaluations are carried out in this section to validate the performance of Lego. By simulating a μService system with a real workload, we have considered the impact of different factors and conducted various quantitative experiments for our three algorithms and the entire three-stage scheme compared with other methods.

Discussion and future work

Lego achieves significant performance gains. However, it also has limitations. Alleviating them will be important future work toward application in production environments.

Poisson process. We use Poisson processes to model system behaviors in the μService system. In fact, request arrivals in general follow a heavy-tailed (Pareto) distribution (Barabasi, 2005). However, the first-come-first-serve message processing in a software environment leads to uniform Poisson-like distributions. The

Conclusion

In this paper, we studied the service request routing and microservice instance deployment problem in the μService environment. We leveraged the characteristic of container-based μServices to model the μService system and formulate the resource allocation problem to be aware of energy consumption and the QoS assurance. We then proposed a three-stage approach (Lego) to address this problem. In particular, the approach first searches and optimizes trade-off decisions of resource allocation for

Acknowledgment

This work is supported by the Provincial Science & Technology Pillar Program of Hubei under Grants 2017AAA027, 2017AAA042 and 2017AHB048.

Yinbo Yu received the B.E. degree in Electronic Information Engineering from the School of Electronic Information, Wuhan University, Wuhan, China, in 2014, where he is currently pursuing the Ph.D. degree. He is also a visiting Ph.D. student with EECS, Northwestern University, Evanston, IL, USA. His research interests include SDN, NFV, cellular networks, service management and network security.

References (46)

  • W. Dawoud et al.

    Elastic virtual machine for fine-grained cloud resource provisioning

  • K. Deb et al.

    A fast and elitist multiobjective genetic algorithm: NSGA-II

    IEEE Trans. Evol. Comput.

    (2002)

  • J.J. Durillo et al.

    Multi-objective particle swarm optimizers: an experimental comparison

  • M. Elnozahy et al.

    Energy conservation policies for web servers

  • X. Fan et al.

    Power provisioning for a warehouse-sized computer

  • M. Fazio et al.

    Open issues in scheduling microservices in the cloud

    IEEE Cloud Comput.

    (2016)

  • M.R. Garey et al.

    (2002)

  • W. Grassmann

    The convexity of the mean queue size of the M/M/c queue with respect to the traffic intensity

    J. Appl. Probab.

    (1983)

  • C. Guerrero et al.

    Resource optimization of container orchestration: a case study in multi-cloud microservices-based applications

    J. Supercomput.

    (2018)

  • B. Jennings et al.

    Resource management in clouds: survey and research challenges

    J. Netw. Syst. Manag.

    (2015)

  • N. Karmarkar et al.

    The Differencing Method of Set Partitioning

    (1982)

  • H. Khazaei et al.

    Performance analysis of cloud computing centers using M/G/m/m+r queuing systems

    IEEE Trans. Parallel Distrib. Syst.

    (2012)

  • S.G. Kulkarni et al.

    NFVnice: dynamic backpressure and scheduling for NFV service chains

    Jianfeng Yang received his Bachelor, Master and Ph.D. degrees in Information and Communication Engineering from Wuhan University, China, in 1998, 2002 and 2009, respectively. He is currently an associate professor at Wuhan University. He was a visiting scholar at Intel in 2012 and at Northwestern University from 2015 to 2016. His research interests are in security and measurement for networking, edge computing, and real-time wireless communication.

    Chengcheng Guo received his Ph.D. degree from the School of Electronic and Information, Wuhan University, China. He received his Bachelor and Master degrees from the Computer School of Wuhan University, China. He is currently a professor and a Ph.D. supervisor in the School of Electronic Information, Wuhan University, China. His research interests are internet and communication technology, wireless mesh networks, industrial control networks, and real-time and reliable communications.

    Hong Zheng received his Ph.D. degree in Photogrammetry and Remote Sensing from Wuhan University, China, in 2000. He worked as a research fellow at Deakin University from 2001 to 2014. He is currently a professor at Wuhan University. His research interests are in machine vision and artificial intelligence.