Optimization of Cloud Workflow Scheduling Based on Balanced Clustering

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10581)

Abstract

Scientific workflow applications consist of many fine-grained computational tasks with dependencies, and the runtimes of these tasks vary widely. Executing such fine-grained tasks in a cloud computing environment generates significant scheduling overhead. Task clustering is a key technique for reducing this overhead and shortening workflow execution time, but it often introduces two problems: runtime imbalance and dependency imbalance. Existing task clustering strategies focus mainly on avoiding runtime imbalance and rarely address the data dependencies between tasks. Clustering that ignores data dependencies lowers the degree of parallelism during task execution because of the data locality it introduces. To address dependency imbalance, we propose the Dependency Balance Clustering Algorithm (DBCA), which defines dependency correlation, a measure of the similarity between tasks in terms of their data dependencies. Tasks with high dependency correlation are clustered together to avoid dependency imbalance. We ran experiments on the WorkflowSim platform and compared DBCA with an existing task clustering method. The results show that DBCA significantly reduces the execution time of the whole workflow.
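The idea of clustering tasks by dependency correlation can be illustrated with a minimal sketch. The abstract does not give the DBCA formula, so the code below assumes a stand-in measure: the Jaccard similarity of two tasks' combined parent and child sets. The function names, the greedy merging loop, and the example task names are all hypothetical and are not taken from the paper; the sketch also ignores runtime balance entirely.

```python
from itertools import combinations


def dependency_correlation(a, b, parents, children):
    """Assumed stand-in measure (not the paper's definition): Jaccard
    similarity of the tasks' combined parent and child sets."""
    na = parents.get(a, set()) | children.get(a, set())
    nb = parents.get(b, set()) | children.get(b, set())
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


def cluster_level(tasks, parents, children, num_clusters):
    """Greedily merge the most correlated pair of clusters until only
    num_clusters remain. Hypothetical helper for illustration."""
    clusters = [[t] for t in tasks]

    def score(c1, c2):
        # Average pairwise correlation between the two clusters.
        total = sum(
            dependency_correlation(a, b, parents, children)
            for a in c1
            for b in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > num_clusters:
        # combinations yields i < j, so deleting j leaves index i valid.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: score(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Four tasks on one workflow level: t1 and t2 share a parent and a child,
# as do t3 and t4.
parents = {"t1": {"p1"}, "t2": {"p1"}, "t3": {"p2"}, "t4": {"p2"}}
children = {"t1": {"c1"}, "t2": {"c1"}, "t3": {"c2"}, "t4": {"c2"}}
result = cluster_level(["t1", "t2", "t3", "t4"], parents, children, 2)
print(result)
```

In this toy example the tasks that share data dependencies end up in the same cluster, which is the behaviour the abstract attributes to grouping by high dependency correlation.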

Author information

Corresponding author

Correspondence to Dongjin Yu.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Zhang, L., Yu, D., Zheng, H. (2017). Optimization of Cloud Workflow Scheduling Based on Balanced Clustering. In: Wen, S., Wu, W., Castiglione, A. (eds) Cyberspace Safety and Security. CSS 2017. Lecture Notes in Computer Science, vol 10581. Springer, Cham. https://doi.org/10.1007/978-3-319-69471-9_26

  • DOI: https://doi.org/10.1007/978-3-319-69471-9_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69470-2

  • Online ISBN: 978-3-319-69471-9

  • eBook Packages: Computer Science, Computer Science (R0)
