Elsevier

Future Generation Computer Systems

Volume 51, October 2015, Pages 120-131

Workload balancing and adaptive resource management for the swift storage system on cloud

https://doi.org/10.1016/j.future.2014.11.006

Highlights

  • We propose a workload balancing and adaptive resource management framework for Swift.

  • We implement optimization algorithms of dynamic workload balancing for Swift.

  • We conduct an experiment to demonstrate the effectiveness of this framework.

Abstract

The demand for big data storage and processing has become a challenge in today’s industry. To meet it, a growing number of enterprises are adopting distributed storage systems. In these systems, storage nodes that hold a concentration of hotspot data can become bottlenecks, while nodes without hotspot data leave computing resources underutilized. This happens because almost all typical distributed storage systems balance only by data amount and do not consider the differing access load of the data. To eliminate these bottlenecks and improve resource utilization, such systems need a workload balancing and adaptive resource management framework. In this paper, we propose such a framework for Swift, a widely used and typical distributed storage system on cloud. In this framework, we design workload monitoring and analysis algorithms for discovering overloaded and underloaded nodes in the cluster. To balance the workload among those nodes, the Split, Merge and Pair Algorithms are implemented to regulate physical machines, while the Resource Reallocate Algorithm is designed to regulate virtual machines on cloud. In addition, by leveraging the mature architecture of distributed storage systems, the framework resides in the hosts and operates through API interception. To demonstrate its effectiveness, we conduct experiments to evaluate it, and the experimental results show that the framework achieves its goals.

Section snippets

Introduction

With the increasing demand for big data processing, traditional centralized storage systems can no longer meet requirements. Distributed storage systems have therefore become increasingly popular because of their horizontal scalability and high robustness. Generally, a distributed storage system contains a number of storage nodes. Data can be distributed evenly across those nodes by techniques such as hashing. By distributing data to different storage nodes, the load of accessing data is also

Related work

As a well-known and typical distributed storage system, Swift plays an important role in cloud storage. It is the storage component of OpenStack and stores the data of thousands of virtual machines. It can also be used independently as an object storage tool. It can store big data and handle massive numbers of requests, and it is reliable and extensible. In Swift, there are

Motivation

This section describes a motivating experiment that shows the poor balancing performance of Swift's default balancing mechanism when hotspot data exist, followed by an analysis of the problem. We conduct the experiment on a cloud platform based on XenServer.
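The effect described here can be illustrated with a small simulation. The following is a minimal sketch, not the authors' experiment: the cluster size, object count, hot-object name and request numbers are all hypothetical, and placement uses a plain hash-modulo rule rather than Swift's actual ring. It shows that hashing can spread objects evenly across nodes while a single hotspot object still concentrates the request load on one node.

```python
import hashlib
from collections import Counter

# All values below are hypothetical and chosen only for illustration.
NUM_NODES = 4
NUM_OBJECTS = 10_000
HOT_OBJECT = "obj-0"          # one hotspot object
HOT_EXTRA_REQUESTS = 10_000   # extra requests received by the hotspot


def node_for(name: str, num_nodes: int = NUM_NODES) -> int:
    """Place an object by hashing its name (simplified; not Swift's ring)."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def simulate():
    """Return per-node object counts and per-node request counts."""
    objects_per_node = Counter()
    requests_per_node = Counter()
    for i in range(NUM_OBJECTS):
        name = f"obj-{i}"
        node = node_for(name)
        objects_per_node[node] += 1
        # Every object gets one request; the hotspot gets many more.
        requests = 1 + (HOT_EXTRA_REQUESTS if name == HOT_OBJECT else 0)
        requests_per_node[node] += requests
    return objects_per_node, requests_per_node


def imbalance(per_node: Counter, num_nodes: int = NUM_NODES) -> float:
    """Ratio of the busiest node to the cluster mean (1.0 means even)."""
    mean = sum(per_node.values()) / num_nodes
    return max(per_node.values()) / mean


objects_per_node, requests_per_node = simulate()
print("object-count imbalance:", round(imbalance(objects_per_node), 3))
print("request-load imbalance:", round(imbalance(requests_per_node), 3))
```

Under these assumptions the object-count imbalance stays close to 1.0, while the node holding the hotspot serves more than half of all requests, so the request-load imbalance is at least 2.0 regardless of where the hash places the hot object.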

Design of framework

This section discusses the design of our framework as well as the algorithms we have incorporated into it. Before introducing the design, we first present the necessary background on virtualization technology and the basic idea of our framework. Then, we present the architecture and the algorithms that address the different requirements for workload balancing and adaptive resource management in different layers.

Algorithms

Currently, our framework mainly integrates algorithms for the physical layer and the virtual layer. Other algorithms can be applied within the framework architecture as well.
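The details of the algorithms are not included in this excerpt. As a rough illustration of the general idea of discovering overloaded and underloaded nodes and matching them up, the sketch below uses a simple threshold rule around the mean load and a greedy pairing. The function names, the thresholds (`high`, `low`) and the sample loads are assumptions made for this example and should not be read as the paper's Split, Merge, Pair or Resource Reallocate Algorithms.

```python
from statistics import mean


def classify(loads, high=1.25, low=0.75):
    """Split nodes into overloaded / underloaded relative to the mean load.

    `high` and `low` are hypothetical thresholds, expressed as multiples
    of the cluster-wide mean load.
    """
    avg = mean(loads.values())
    overloaded = sorted(
        (n for n, load in loads.items() if load > high * avg),
        key=lambda n: loads[n],
        reverse=True,  # heaviest first
    )
    underloaded = sorted(
        (n for n, load in loads.items() if load < low * avg),
        key=lambda n: loads[n],  # lightest first
    )
    return overloaded, underloaded


def pair(overloaded, underloaded):
    """Greedily match the heaviest overloaded node with the lightest underloaded one."""
    return list(zip(overloaded, underloaded))


# Hypothetical per-node request rates (requests per second).
loads = {"node-a": 900, "node-b": 120, "node-c": 410, "node-d": 170}
over, under = classify(loads)
print(pair(over, under))  # [('node-a', 'node-b')]
```

With the sample loads the mean is 400, so only `node-a` exceeds the upper threshold of 500, while `node-b` and `node-d` fall below the lower threshold of 300. The greedy pairing matches `node-a` with `node-b`, the lightest candidate.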

Experiment settings

To demonstrate the effectiveness of the proposed framework, we build an experimental environment with 4 physical nodes and 15 virtual nodes.

On the physical nodes, XenServer is set up as the virtualization server. In each virtual node, the framework is deployed and interacts with the API of XenServer.

Ubuntu 12.04 Server is installed in each virtual node, and Swift storage nodes are deployed in these virtual nodes.

The details of the physical machines used to set up the experiment

Conclusions

As discussed above, dynamic workload-aware balancing is important to storage applications with hotspot data. It is an indispensable complement to the traditional data-amount balancing mechanisms. By achieving workload-aware balancing, the proposed framework is effective for tuning system performance when hotspot data exist.

We can draw some conclusions about the advantages and disadvantages of the proposed framework. The advantages of the proposed framework are as follows.

—The proposed

Zhenhua Wang received his B.E. degree in Software Engineering from the School of Software at Shanghai Jiao Tong University in 2013. His research focuses on cloud computing, distributed computing systems, resource management, and big data, among others.



Haopeng Chen received his Ph.D. degree from the Department of Computer Science and Engineering, Northwestern Polytechnical University, in 2001. He has worked in the School of Software, Shanghai Jiao Tong University, since 2004, after finishing two years of postdoctoral research in the Department of Computer Science and Engineering, Shanghai Jiao Tong University. He became an Associate Professor in 2008. In 2010, he was a visiting scholar at the Georgia Institute of Technology. His research group focuses on distributed computing and software engineering. They have worked on Web Services, Web 2.0, Java EE, .NET, and SOA for several years. Recently, they have also become interested in cloud computing and related areas such as cloud federation, resource management, and dynamic scaling up and down.

Ying Fu received her B.E. degree in Software Engineering from the School of Software at Shanghai Jiao Tong University in 2013. She concentrates on big data, data sharding, and SaaS.

Delin Liu received his B.E. degree in Software Engineering from the School of Software at Shanghai Jiao Tong University in 2014. His current research direction is massive data storage.

Yunmeng Ban received her B.E. degree in Software Engineering from the School of Software at Shanghai Jiao Tong University in 2013. She is now a Ph.D. candidate at the University of Massachusetts at Amherst. Her research interests are distributed computing and massive data storage and processing.
