Reasoning task dependencies for robust service selection in data intensive workflows

Wang, Mingzhong; Zhu, Liehuang; Ramamohanarao, Kotagiri

doi:10.1007/s00607-013-0381-6

Published: 28 December 2013

Computing, volume 97, pages 337–355 (2015)

Mingzhong Wang¹,
Liehuang Zhu¹ &
Kotagiri Ramamohanarao²

Abstract

Selecting services for task execution in workflows should not only satisfy budget and deadline constraints, but also maximize the probability that the workflow succeeds and minimize the potential loss when exceptions occur. This requirement is especially critical for data-intensive applications in grids or clouds, where any failure is costly. We therefore design a fine-grained risk evaluation model, customized for workflows, that precisely computes the cost of failure for the selected services. In contrast to the current coarse-grained model, ours takes task dependencies into account and assigns a higher impact factor to tasks near the end of the workflow. We then build a utility function on this model and apply a genetic algorithm to find optimized service allocations, maximizing the robustness of the workflow while minimizing the risk of failure. Experiments and analysis show that applying the customized risk evaluation model to service selection generally improves a workflow's probability of success while reducing its exposure to risk.
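
The idea of weighting a task's failure by the upstream work it would waste can be illustrated with a small Python sketch. This is not the paper's model: the toy workflow, the candidate services, the risk formula (failure probability times the cost of the task plus all of its ancestors), the utility weight `risk_weight`, and the independence of failures are all assumptions made for illustration. Exhaustive search stands in for the genetic algorithm to keep the example short and deterministic, and the deadline constraint is omitted.

```python
from itertools import product

# Hypothetical toy workflow: task -> list of direct predecessors.
DEPS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

# Hypothetical candidate services per task: (cost, success probability).
CANDIDATES = {
    "A": [(2.0, 0.99), (1.0, 0.90)],
    "B": [(4.0, 0.98), (3.0, 0.92)],
    "C": [(4.0, 0.97), (2.0, 0.90)],
    "D": [(5.0, 0.99), (3.0, 0.85)],
}


def ancestors(task, deps):
    """Return all direct and transitive predecessors of a task."""
    seen = set()
    stack = list(deps[task])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(deps[current])
    return seen


def success_probability(selection):
    """Probability that every task succeeds, assuming independent failures."""
    prob = 1.0
    for _, task_prob in selection.values():
        prob *= task_prob
    return prob


def dependency_aware_risk(selection, deps):
    """Assumed risk measure: sum of P(fail) * (own cost + upstream cost lost)."""
    risk = 0.0
    for task, (cost, task_prob) in selection.items():
        sunk = cost + sum(selection[a][0] for a in ancestors(task, deps))
        risk += (1.0 - task_prob) * sunk
    return risk


def best_selection(candidates, deps, budget, risk_weight=0.1):
    """Try every combination within budget; return the highest-utility one."""
    tasks = list(candidates)
    best, best_utility = None, float("-inf")
    for combo in product(*(candidates[t] for t in tasks)):
        if sum(cost for cost, _ in combo) > budget:
            continue  # violates the budget constraint
        selection = dict(zip(tasks, combo))
        # Assumed utility: reward success probability, penalize risk.
        utility = success_probability(selection) - risk_weight * dependency_aware_risk(selection, deps)
        if utility > best_utility:
            best, best_utility = selection, utility
    return best


if __name__ == "__main__":
    print(best_selection(CANDIDATES, DEPS, budget=13.0))
```

In this sketch a failure of task `D` is charged with the cost of `A`, `B`, and `C` as well as its own, so unreliable services are penalized more heavily near the end of the workflow than at the start. For realistic workflow sizes the search space grows exponentially, which is where a heuristic such as a genetic algorithm becomes necessary.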

Acknowledgments

The research work reported in this paper is supported by the National Science Foundation of China under Grant Nos. 61100172 and 61272512. A preliminary version of this paper appeared in the 2012 IPDPS Workshop of Large Scale Distributed Service-oriented Systems.

Author information

Authors and Affiliations

School of Computer Science, Beijing Institute of Technology, Beijing, 100081, China
Mingzhong Wang & Liehuang Zhu
Department of Computing and Information Systems, The University of Melbourne, Victoria, 3010, Australia
Kotagiri Ramamohanarao

Corresponding author

Correspondence to Mingzhong Wang.

About this article

Cite this article

Wang, M., Zhu, L. & Ramamohanarao, K. Reasoning task dependencies for robust service selection in data intensive workflows. Computing 97, 337–355 (2015). https://doi.org/10.1007/s00607-013-0381-6

Received: 30 March 2013
Accepted: 18 December 2013
Published: 28 December 2013
Issue Date: April 2015
DOI: https://doi.org/10.1007/s00607-013-0381-6

Mathematics Subject Classification (2010)

68M14 Distributed systems
