Abstract
Scientific applications are often structured as workflows that execute a series of distributed software modules to analyze large data sets. Such workflows are typically constructed using general-purpose scripting languages, which coordinate the execution of the various modules and exchange data sets between them. While such scripts are a cost-effective approach for simple workflows, they quickly become intricate and difficult to modify as the workflow structure grows more complex and evolves. This makes them a major barrier to deploying new algorithms quickly and to exploiting new, scalable hardware platforms. In this paper, we describe the MeDICi Workflow technology, which is specifically designed to reduce the complexity of workflow application development and to handle data-intensive workflow applications efficiently. MeDICi integrates standard component-based and service-based technologies, and employs an integration mechanism designed so that large data sets can be processed efficiently. We illustrate the use of MeDICi with a climate data processing example that we have built, and describe some of the new features we are creating to further enhance MeDICi Workflow applications.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gorton, I., Chase, J., Wynne, A., Almquist, J., Chappell, A. (2009). Services + Components = Data Intensive Scientific Workflow Applications with MeDICi. In: Lewis, G.A., Poernomo, I., Hofmeister, C. (eds) Component-Based Software Engineering. CBSE 2009. Lecture Notes in Computer Science, vol 5582. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02414-6_14
DOI: https://doi.org/10.1007/978-3-642-02414-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02413-9
Online ISBN: 978-3-642-02414-6
eBook Packages: Computer Science, Computer Science (R0)