Hypergraph-Based Data Reduced Scheduling Policy for Data-Intensive Workflow in Clouds

Hu, Zhigang; Li, Jia; Zheng, Meiguang; Zhang, Xinxin; Kang, Hui; Tao, Yong; Yang, Jiao

doi:10.1007/978-981-10-6388-6_28

Part of the book series: Communications in Computer and Information Science (CCIS, volume 728)

Included in the following conference series:

International Conference of Pioneering Computer Scientists, Engineers and Educators

Abstract

Data-intensive computing is expected to be the next-generation IT computing paradigm, and data-intensive workflows in clouds are becoming increasingly popular. Scheduling such workflows efficiently has become a key issue. In this paper, we first build a directed hypergraph model for data-intensive workflows, because hypergraphs model communication volume more accurately and represent asymmetric problems better, and because the cut metric of hypergraphs is well suited to minimizing the total communication volume. Second, we propose the concept of data supportive ability to help represent data-intensive workflow applications, and we give the details of the merge operation that takes data supportive ability into account. Third, we present an optimized multi-level hypergraph partitioning algorithm. Finally, we propose HEFT-P, a data-reduced scheduling policy for data-intensive workflows. Through simulation, we compare HEFT-P with three typical workflow scheduling policies. The results indicate that HEFT-P can achieve data-reduced scheduling and reduce the makespan of data-intensive workflows.
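
The claim that a hypergraph cut tracks communication volume more closely than an ordinary graph cut can be illustrated with a small sketch. The code below is not the paper's algorithm. It assumes the widely used connectivity-minus-one cut metric (the abstract does not say which cut metric is used), a hypothetical toy workflow in which each hyperedge is one data file linking its producer task to all of its consumer tasks, and a model in which a file is shipped at most once to each partition that needs it. The names `Net`, `connectivity_cut`, `graph_edge_cut`, `NETS`, and `PARTITION` are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Net:
    """One directed hyperedge: a data file, its producer, and its consumers."""
    source: str               # task that produces the file
    sinks: Tuple[str, ...]    # tasks that consume the file
    weight: float             # file size (arbitrary units, assumed)

    @property
    def pins(self) -> Tuple[str, ...]:
        return (self.source,) + self.sinks


def connectivity_cut(nets: List[Net], part_of: Dict[str, int]) -> float:
    """Connectivity-minus-one metric: each net pays its weight once for
    every partition it touches beyond the first."""
    total = 0.0
    for net in nets:
        spanned = {part_of[task] for task in net.pins}
        total += net.weight * (len(spanned) - 1)
    return total


def graph_edge_cut(nets: List[Net], part_of: Dict[str, int]) -> float:
    """Plain graph model for comparison: every producer-consumer pair is a
    separate edge, so a shared file is counted once per remote consumer."""
    total = 0.0
    for net in nets:
        for sink in net.sinks:
            if part_of[sink] != part_of[net.source]:
                total += net.weight
    return total


# Hypothetical workflow: t1 produces a 10-unit file read by t2, t3, and t4;
# t2 and t3 each produce a file read by t5.
NETS = [
    Net("t1", ("t2", "t3", "t4"), 10.0),
    Net("t2", ("t5",), 4.0),
    Net("t3", ("t5",), 6.0),
]
# Hypothetical two-way partition of the tasks.
PARTITION = {"t1": 0, "t2": 0, "t3": 1, "t4": 1, "t5": 1}

if __name__ == "__main__":
    print("hypergraph cut:", connectivity_cut(NETS, PARTITION))
    print("graph edge cut:", graph_edge_cut(NETS, PARTITION))
```

In this toy case the file produced by t1 has two consumers in partition 1. The graph model charges for it twice, while the hypergraph metric charges for it once, which matches the assumed single transfer per partition.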

Acknowledgements

The authors warmly thank the reviewers for their insightful comments which helped to improve this work. This work was supported in part by National Natural Science Foundation of China (NSFC), project 61602525 and project 61572525.

Author information

Authors and Affiliations

School of Software, Central South University, Changsha, 410075, China
Zhigang Hu, Jia Li, Meiguang Zheng, Xinxin Zhang, Hui Kang, Yong Tao & Jiao Yang

Corresponding author

Correspondence to Meiguang Zheng.

Editor information

Editors and Affiliations

Central South University, Changsha, China
Beiji Zou
Harbin Engineering University, Harbin, China
Qilong Han
Harbin University of Science and Technology, Harbin, China
Guanglu Sun
Northeast Forestry University, Harbin, China
Weipeng Jing
Huaihua University, Huaihua, Hunan, China
Xiaoning Peng
Sciences of Country Tripod Institute of Data Science, Harbin, China
Zeguang Lu

About this paper

Cite this paper

Hu, Z. et al. (2017). Hypergraph-Based Data Reduced Scheduling Policy for Data-Intensive Workflow in Clouds. In: Zou, B., Han, Q., Sun, G., Jing, W., Peng, X., Lu, Z. (eds) Data Science. ICPCSEE 2017. Communications in Computer and Information Science, vol 728. Springer, Singapore. https://doi.org/10.1007/978-981-10-6388-6_28

DOI: https://doi.org/10.1007/978-981-10-6388-6_28
Published: 16 September 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6387-9
Online ISBN: 978-981-10-6388-6
eBook Packages: Computer Science (R0)
