Scheduling Data-Intensive Workloads in Large-Scale Distributed Systems: Trends and Challenges

  • Chapter
Modeling and Simulation in HPC and Cloud Systems

Part of the book series: Studies in Big Data (SBD, volume 36)

Abstract

With the explosive growth of big data, workloads tend to get more complex and computationally demanding. Such applications are processed on distributed interconnected resources that are becoming larger in scale and computational capacity. Data-intensive applications may have different degrees of parallelism and must effectively exploit data locality. Furthermore, they may impose several Quality of Service requirements, such as time constraints and resilience against failures, as well as other objectives, like energy efficiency. These features of the workloads, as well as the inherent characteristics of the computing resources required to process them, present major challenges that require the employment of effective scheduling techniques. In this chapter, a classification of data-intensive workloads is proposed and an overview of the most commonly used approaches for their scheduling in large-scale distributed systems is given. We present novel strategies that have been proposed in the literature and shed light on open challenges and future directions.

Acknowledgements

The second author of this chapter, Helen D. Karatza, has been invited as a trainer to the cHiPSet Training School 2016 “New Trends in Modeling and Simulation in HPC Systems”, held in Bucharest, Romania, 21–23 September 2016, and has been supported by the IC1406 Horizon 2020 grant.

Author information

Corresponding author

Correspondence to Georgios L. Stavrinides.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Stavrinides, G.L., Karatza, H.D. (2018). Scheduling Data-Intensive Workloads in Large-Scale Distributed Systems: Trends and Challenges. In: Kołodziej, J., Pop, F., Dobre, C. (eds) Modeling and Simulation in HPC and Cloud Systems. Studies in Big Data, vol 36. Springer, Cham. https://doi.org/10.1007/978-3-319-73767-6_2

  • DOI: https://doi.org/10.1007/978-3-319-73767-6_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73766-9

  • Online ISBN: 978-3-319-73767-6

  • eBook Packages: Engineering, Engineering (R0)
