Abstract
In this work, we tackle several issues that limit the throughput of runtime-adaptive applications on state-of-the-art HPC systems. A first issue is that the runtime of adaptive applications is generally hard to predict, since neither the workload induced by adaptivity nor the number of time steps or iterations is known in advance. Another issue is that resource scheduling on HPC systems is currently done before an application is started and remains unchanged afterwards, even if the application's requirements vary. Furthermore, an application cannot be started once another running application has allocated all resources. We address these issues by designing algorithms that adapt their use of resources during runtime, for example by releasing or requesting compute cores. If concurrent applications compete for resources, this requires an appropriate resource management.
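The release/request pattern described above can be sketched as follows. This is a minimal illustrative model only: the `ResourceManager` class, its `invade`/`retreat` methods, and the workload-to-cores heuristic are assumptions made for the example, not the chapter's actual API.

```python
# Hypothetical sketch of a resource-adaptive simulation loop. The names
# `invade` and `retreat` follow the invasive-computing idiom; the manager
# here is a toy stand-in for a global resource manager.

class ResourceManager:
    """Toy global resource manager handing out cores from a fixed pool."""

    def __init__(self, total_cores):
        self.free_cores = total_cores

    def invade(self, requested):
        # The manager may grant fewer cores than requested if other
        # applications currently hold parts of the pool.
        granted = min(requested, self.free_cores)
        self.free_cores -= granted
        return granted

    def retreat(self, cores):
        # Release cores back to the pool for competing applications.
        self.free_cores += cores


def run_adaptive_step(manager, workload_cells, cells_per_core=1000):
    """Request cores proportional to the current (adaptivity-dependent)
    workload, run one time step, then release the cores again."""
    wanted = max(1, workload_cells // cells_per_core)
    granted = manager.invade(wanted)
    try:
        # ... compute one time step of the simulation on `granted` cores ...
        return granted
    finally:
        manager.retreat(granted)
```

Because the workload of an adaptive simulation changes between time steps, the request size changes as well, so cores migrate between applications at step boundaries rather than being fixed at startup.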
We present a solution to these issues based on invasive paradigms: applications are started and resources are rescheduled during runtime. The distribution of cores to applications is handled by a global resource manager. We introduce scalability graphs to improve load balancing across multiple applications. For adaptive simulations, several scalability graphs are used to account for the different scalability phases caused by changing workload.
As a proof of concept, we present runtime and throughput results for a fully adaptive shallow-water simulation.
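One way a global resource manager could use scalability graphs is to hand out cores greedily to whichever application gains the most marginal throughput from one more core. The function below is a sketch under that assumption; the graph data in the test is made up, and the greedy policy is an illustration, not the chapter's actual scheduling algorithm.

```python
# Illustrative core scheduling from per-application scalability graphs.
# A scalability graph maps a core count to the throughput an application
# achieves with that many cores.

def distribute_cores(scalability_graphs, total_cores):
    """Greedily assign `total_cores` using marginal throughput gains.

    scalability_graphs: dict mapping an application name to a list where
    entry c is the application's throughput when running on c cores.
    Returns a dict mapping each application to its assigned core count.
    """
    assignment = {app: 0 for app in scalability_graphs}
    for _ in range(total_cores):
        best_app, best_gain = None, 0.0
        for app, graph in scalability_graphs.items():
            cores = assignment[app]
            if cores + 1 < len(graph):
                gain = graph[cores + 1] - graph[cores]
                if gain > best_gain:
                    best_app, best_gain = app, gain
        if best_app is None:
            break  # no application benefits from another core
        assignment[best_app] += 1
    return assignment
```

For an adaptive simulation, the manager would switch between several such graphs as the simulation moves through phases with different workloads, which is why a single static graph per application is not sufficient.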
© 2013 Springer-Verlag Berlin Heidelberg
Cite this chapter
Bader, M., Bungartz, HJ., Schreiber, M. (2013). Invasive Computing on High Performance Shared Memory Systems. In: Keller, R., Kramer, D., Weiss, JP. (eds) Facing the Multicore-Challenge III. Lecture Notes in Computer Science, vol 7686. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35893-7_1
DOI: https://doi.org/10.1007/978-3-642-35893-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35892-0
Online ISBN: 978-3-642-35893-7
eBook Packages: Computer Science (R0)