Measuring the Effectiveness of Throttled Data Transfers on Data-Intensive Workflows

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7327)

Abstract

In data-intensive workflows, which often involve files, data is typically transferred between tasks as fast as the network links allow, and once transferred, the files are buffered or stored at their destination. A task that requires multiple files from different preceding tasks must remain idle until all of them are available. Hence, network bandwidth and buffer/storage within a workflow are often used ineffectively. In this paper, we quantitatively measure the impact that an intelligent data movement policy can have on buffer/storage use in comparison with existing approaches. Our main objective is to propose a metric that considers the workflow structure, expressed as a Directed Acyclic Graph (DAG), together with performance information collected from past executions of the workflow. The metric is intended for use at the design stage, to compare various DAG structures and evaluate their potential for optimisation of network bandwidth and buffer use.
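The buffering problem described in the abstract can be illustrated with a small Python sketch. It is not the metric proposed in the paper; it is a simplified illustration under stated assumptions: a single join task, one dedicated link per input file, and a cost counted only while a fully received file waits for the last input. The names `Transfer`, `idle_buffer_mb_s`, and `throttled_rates`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """One input file of a join task (hypothetical illustration values)."""
    size_mb: float         # file size in megabytes
    ready_s: float         # time the producing task finishes, in seconds
    bandwidth_mbps: float  # link capacity in megabytes per second


def arrival_time(t: Transfer) -> float:
    """Arrival time when the file is sent at full link speed."""
    return t.ready_s + t.size_mb / t.bandwidth_mbps


def idle_buffer_mb_s(transfers: list[Transfer]) -> float:
    """MB*seconds that fully received files wait for the last input.

    Assumption: the join task starts when the slowest input arrives.
    """
    start = max(arrival_time(t) for t in transfers)
    return sum(t.size_mb * (start - arrival_time(t)) for t in transfers)


def throttled_rates(transfers: list[Transfer]) -> list[float]:
    """Lowest rate (MB/s) per input that still meets the join start time."""
    start = max(arrival_time(t) for t in transfers)
    return [t.size_mb / (start - t.ready_s) for t in transfers]


if __name__ == "__main__":
    inputs = [Transfer(100, 0, 10), Transfer(400, 0, 10)]
    print(idle_buffer_mb_s(inputs))  # buffer cost at full speed
    print(throttled_rates(inputs))   # rates that avoid early arrival
```

In this sketch the smaller file could be sent at a quarter of the link speed without delaying the join task, which frees bandwidth and avoids holding a complete file in the buffer while the larger one is still in transit.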

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rodríguez, R.J., Tolosana-Calasanz, R., Rana, O.F. (2012). Measuring the Effectiveness of Throttled Data Transfers on Data-Intensive Workflows. In: Jezic, G., Kusek, M., Nguyen, NT., Howlett, R.J., Jain, L.C. (eds) Agent and Multi-Agent Systems. Technologies and Applications. KES-AMSTA 2012. Lecture Notes in Computer Science, vol 7327. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30947-2_18

  • DOI: https://doi.org/10.1007/978-3-642-30947-2_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30946-5

  • Online ISBN: 978-3-642-30947-2

  • eBook Packages: Computer Science, Computer Science (R0)
