ABSTRACT
Estimates of task characteristics such as runtime, disk space, and memory consumption are commonly used by scheduling algorithms and resource provisioning techniques to provide successful and efficient workflow executions. These methods assume that accurate estimates are available, but in production systems such estimates are hard to compute accurately. In this work, we first profile three real scientific workflows, collecting fine-grained information such as process I/O, runtime, memory usage, and CPU utilization. We then propose a method to automatically characterize workflow task needs based on these profiles. Our method estimates task runtime, disk space, and memory consumption based on the size of the tasks' input data. It looks for correlations between the parameters of a dataset and, if no correlation is found, divides the dataset into smaller subsets using a clustering technique. Task behavior is estimated from the ratio of parameter to input data size if the two are correlated, or from the mean value otherwise. However, task dependencies in scientific workflows lead to a chain of estimation errors. To correct such errors, we propose an online estimation process based on the MAPE-K loop, in which task executions are constantly monitored and estimates are updated accordingly. Experimental results show that our online estimation process yields much more accurate predictions than an offline approach, where all task needs are estimated at once.
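The estimation rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the Pearson correlation test, and the 0.8 threshold are assumptions made for illustration, and the clustering step applied to uncorrelated datasets is omitted. The sketch also assumes all input sizes are positive.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sample carries no correlation information
    return cov / (var_x * var_y) ** 0.5


def estimate(input_sizes, values, new_input_size, threshold=0.8):
    """Estimate a task parameter (e.g. runtime) for a new input size.

    Hypothetical rule: if the parameter is strongly correlated with the
    input size, scale the mean ratio parameter/input size by the new
    input size; otherwise fall back to the mean observed value.
    The threshold value is an assumption, not taken from the paper.
    """
    if pearson(input_sizes, values) >= threshold:
        ratio = mean(v / s for s, v in zip(input_sizes, values))
        return ratio * new_input_size
    return mean(values)


if __name__ == "__main__":
    # Made-up profiling data: input sizes in MB, runtimes in seconds.
    sizes = [10, 20, 30, 40]
    print(estimate(sizes, [20, 40, 60, 80], 50))  # correlated: ratio-based
    print(estimate(sizes, [5, 9, 9, 5], 50))      # uncorrelated: mean-based
```

In an online setting such as the one the abstract describes, a function like this would be called again each time a monitored task finishes and its measured values are added to the dataset.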