Adaptive Workload Partitioning and Allocation for Data Intensive Scientific Applications

  • Chapter
Cloud Computing for Data-Intensive Applications

Abstract

Scientific applications are becoming increasingly data intensive, and traditional load-balancing solutions must be reconsidered to scale both data and computation across different parallel systems. This chapter examines state-transition applications, a representative class of scientific applications that address grand-challenge problems (e.g., weather forecasting and ocean prediction) and process large volumes of data. We propose an adaptive workload partitioning and allocation scheme for parallelizing state-transition applications on various parallel systems. Existing schemes do not adequately balance the computation of complicated scientific algorithms and the growing volumes of scientific data at the same time. Our solution addresses this problem by introducing a time metric that unifies the computation and data workloads. System profiles, expressed as CPU and I/O speeds, are taken into account to accommodate system diversity and to allow accurate workload estimation. The solution consists of two major components: (1) an adaptive decomposition scheme that uses a quad-tree structure to break up the workload and manage data dependencies; and (2) a decentralized scheme for distributing the workload across processors. Experimental results on real-world weather data demonstrate that the solution outperforms other partitioning schemes and can be readily ported to diverse systems with satisfactory performance.
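The abstract gives no formulas, so the following is only an illustrative sketch of the two ideas it names: a time metric that combines computation and I/O cost, and a quad-tree split driven by that metric. It is not the chapter's algorithm. The names `SystemProfile`, `Block`, `estimated_time`, `decompose`, and `max_time` are hypothetical, the cost model (operations divided by CPU speed plus bytes divided by I/O speed) is an assumption, and the grid side is assumed to be a power of two.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SystemProfile:
    """Hypothetical system profile: sustained CPU and I/O rates."""
    cpu_ops_per_sec: float
    io_bytes_per_sec: float


@dataclass
class Block:
    """Square region of the grid; (x, y) is its top-left cell."""
    x: int
    y: int
    size: int


def estimated_time(block: Block, ops: List[List[float]],
                   nbytes: List[List[float]], profile: SystemProfile) -> float:
    """Assumed time metric: compute seconds plus I/O seconds for one block."""
    rows = range(block.y, block.y + block.size)
    cols = range(block.x, block.x + block.size)
    total_ops = sum(ops[r][c] for r in rows for c in cols)
    total_bytes = sum(nbytes[r][c] for r in rows for c in cols)
    return (total_ops / profile.cpu_ops_per_sec
            + total_bytes / profile.io_bytes_per_sec)


def decompose(block: Block, ops: List[List[float]], nbytes: List[List[float]],
              profile: SystemProfile, max_time: float) -> List[Block]:
    """Split a block into four quadrants until each leaf fits max_time.

    Single-cell blocks are never split, so a leaf may still exceed max_time.
    """
    if block.size == 1 or estimated_time(block, ops, nbytes, profile) <= max_time:
        return [block]
    half = block.size // 2
    leaves: List[Block] = []
    for dy in (0, half):
        for dx in (0, half):
            child = Block(block.x + dx, block.y + dy, half)
            leaves.extend(decompose(child, ops, nbytes, profile, max_time))
    return leaves
```

Under these assumptions, regions with heavy computation or heavy data are refined into smaller blocks while light regions stay coarse, which is the general behavior an adaptive decomposition aims for. Allocating the resulting leaves to processors is a separate step and is not sketched here.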



Author information

Corresponding author

Correspondence to Xin Yang.


Copyright information

© 2014 Springer Science+Business Media New York

About this chapter

Cite this chapter

Yang, X., Li, X. (2014). Adaptive Workload Partitioning and Allocation for Data Intensive Scientific Applications. In: Li, X., Qiu, J. (eds) Cloud Computing for Data-Intensive Applications. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-1905-5_6


  • DOI: https://doi.org/10.1007/978-1-4939-1905-5_6


  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4939-1904-8

  • Online ISBN: 978-1-4939-1905-5

  • eBook Packages: Computer Science, Computer Science (R0)
