A multi-parameter scheduling method of dynamic workloads for big data calculation in cloud computing

The Journal of Supercomputing

Abstract

Workload scheduling in cloud computing is an active research field. Scheduling strongly affects cloud performance, especially when the platform is used for big data analysis and less predictable workloads enter the cloud dynamically. Finding an optimized scheduling solution under different parameters and in different environments remains challenging. In dynamic environments such as the cloud, scheduling strategies must change quickly so that they can adapt to changes in the input workloads. However, the quality of the solution trades off against the time needed to find it. This article proposes an ordinal optimization method that considers the volume of workloads, load balancing, and the volume of messages exchanged among virtual clusters, taking replication into account. The algorithm is based on ordinal optimization (OO) and evolutionary OO (eOO). In each time period, a criterion is calculated that measures the similarity of the workloads in two consecutive time periods, so that the scheduling procedure can be changed in a timely manner. Using more than one parameter, a suitable schedule is created for each time period. The schedule assigns a number of virtual machines to each virtual cluster; if the workloads of two consecutive time periods are sufficiently similar, this rescheduling step is skipped. The results show that, in a reasonable time, the method obtains a more optimized solution than the compared methods, namely blind pick, OO, Monte Carlo and eOO. The method is also flexible: the weights of the proposed criteria can be changed to suit different environmental conditions. The proposed method achieved up to a 28% performance improvement over eOO.
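The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the similarity measure (one minus a normalized L1 distance), the three cost terms, the default weights, the threshold and all function names are assumptions made for illustration only, and the candidate search is a plain sample-and-rank step rather than full OO or eOO.

```python
import random
from typing import List, Sequence


def workload_similarity(prev: Sequence[float], curr: Sequence[float]) -> float:
    """Hypothetical similarity in [0, 1] between per-cluster workload volumes
    of two consecutive time periods (1 minus normalized L1 distance)."""
    total = sum(prev) + sum(curr)
    if total == 0:
        return 1.0
    diff = sum(abs(p - c) for p, c in zip(prev, curr))
    return 1.0 - diff / total


def evaluate(schedule: Sequence[int], workload: Sequence[float],
             weights: Sequence[float] = (0.5, 0.3, 0.2)) -> float:
    """Weighted multi-parameter cost (lower is better). All three terms are
    illustrative stand-ins for the criteria named in the abstract."""
    times = [w / v for w, v in zip(workload, schedule)]
    volume_term = max(times)                    # slowest virtual cluster
    balance_term = max(times) - min(times)      # load imbalance
    # Assumed message model: traffic grows with VM pairs inside each cluster.
    message_term = sum(v * (v - 1) / 2 for v in schedule)
    return (weights[0] * volume_term
            + weights[1] * balance_term
            + weights[2] * message_term)


def random_allocation(n_clusters: int, total_vms: int,
                      rng: random.Random) -> List[int]:
    """Give every virtual cluster one VM, then spread the rest at random."""
    alloc = [1] * n_clusters
    for _ in range(total_vms - n_clusters):
        alloc[rng.randrange(n_clusters)] += 1
    return alloc


def choose_schedule(prev_workload: Sequence[float],
                    curr_workload: Sequence[float],
                    prev_schedule: Sequence[int],
                    total_vms: int,
                    threshold: float = 0.9,
                    n_samples: int = 200,
                    rng: random.Random = None) -> List[int]:
    """Reuse the previous schedule when workloads are similar enough;
    otherwise sample candidate allocations and keep the cheapest one."""
    if workload_similarity(prev_workload, curr_workload) >= threshold:
        return list(prev_schedule)              # rescheduling is skipped
    rng = rng or random.Random()
    candidates = [random_allocation(len(curr_workload), total_vms, rng)
                  for _ in range(n_samples)]
    return min(candidates, key=lambda s: evaluate(s, curr_workload))
```

Changing the `weights` tuple shifts the emphasis among the three terms, which mirrors the flexibility claimed in the abstract; the values shown are arbitrary placeholders.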

Abbreviations

OO: Ordinal optimization
eOO: Evolutionary ordinal optimization
VM: Virtual machine
LAN: Local area network
JAWS: Job-aware workload scheduling
PB: Petabytes
RFOH: Resource fault occurrence history
IOO: Iterative ordinal optimization
MOS: Multi-objective scheduling
DVS: Dynamic voltage scaling
VC: Virtual cluster
BIG: Big workflow generation
VBIG: Very big workflow generation
HUGE: Huge workflow generation
MEOO: Multi-parameter evolutionary algorithm
FIFO: First in first out


Author information

Corresponding author

Correspondence to Amir Masoud Rahmani.

Appendices

Appendix 1: Performance improvement of proposed method

See Tables 3, 4 and 5.

Table 3 Performance improvement of the proposed algorithm compared to the others in BIG scenarios
Table 4 Performance improvement of the proposed algorithm compared to the others in VBIG scenarios
Table 5 Performance improvement of the proposed algorithm compared to the others in HUGE scenarios

Appendix 2: Overhead reduction of proposed algorithm

See Tables 6, 7 and 8.

Table 6 Overhead reduction of the proposed algorithm compared to the others in BIG scenarios
Table 7 Overhead reduction of the proposed algorithm compared to the others in VBIG scenarios
Table 8 Overhead reduction of the proposed algorithm compared to the others in HUGE scenarios

Cite this article

Hanani, A., Rahmani, A.M. & Sahafi, A. A multi-parameter scheduling method of dynamic workloads for big data calculation in cloud computing. J Supercomput 73, 4796–4822 (2017). https://doi.org/10.1007/s11227-017-2050-6

  • DOI: https://doi.org/10.1007/s11227-017-2050-6
