Integrating Dynamic Memory Placement with Adaptive Load-Balancing for Parallel Codes on NUMA Multiprocessors

Slavin, Paul; Freeman, Len

doi:10.1007/978-3-540-85451-7_30

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5168)

Abstract

This paper describes and evaluates a system of dynamic memory migration for codes executing in a Non-Uniform Memory Access (NUMA) environment. The system uses information about the load imbalance within a workload to determine the affinity between threads of the application and regions of memory. This information then serves as the basis of migration decisions, with the aim of minimising the NUMA distance between code and the memory it accesses. Results are presented which demonstrate the effectiveness of this technique in reducing the runtime of a set of representative HPC kernels.
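The idea in the abstract can be illustrated with a small sketch. It is not the authors' system: the names `Chunk`, `plan_migrations`, `thread_node`, `page_node` and `iters_per_page` are hypothetical, and the sketch assumes a one-dimensional array whose pages each hold a fixed number of loop iterations' worth of data. Given the iteration ranges that a load balancer has assigned to each thread, it picks, for every page, the NUMA node whose threads touch most of that page, and reports the pages that currently live elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Half-open range [start, stop) of loop iterations given to one thread."""
    thread: int
    start: int
    stop: int


def plan_migrations(chunks, thread_node, page_node, iters_per_page):
    """Return {page: target_node} for pages that should move.

    Assumption (not from the paper): iteration i touches only the page
    i // iters_per_page, so thread-to-page affinity follows directly
    from the load balancer's iteration ranges.
    """
    moves = {}
    for page, current in enumerate(page_node):
        lo = page * iters_per_page
        hi = lo + iters_per_page
        touched = {}  # node -> number of this page's iterations run there
        for c in chunks:
            overlap = min(hi, c.stop) - max(lo, c.start)
            if overlap > 0:
                node = thread_node[c.thread]
                touched[node] = touched.get(node, 0) + overlap
        if not touched:
            continue
        # Prefer the node with most accesses; on a tie keep the current node.
        best = max(touched, key=lambda n: (touched[n], n == current))
        if best != current:
            moves[page] = best
    return moves


# Two threads on two nodes; 8 pages of 4 iterations each, split evenly at first.
thread_node = {0: 0, 1: 1}
page_node = [0, 0, 0, 0, 1, 1, 1, 1]
# After rebalancing, thread 0 takes iterations 0..23 and thread 1 takes 24..31.
chunks = [Chunk(0, 0, 24), Chunk(1, 24, 32)]
moves = plan_migrations(chunks, thread_node, page_node, 4)
print(moves)  # {4: 0, 5: 0}
```

In this example the rebalanced thread 0 now works on iterations 16 to 23, whose data sits on node 1, so the sketch proposes moving pages 4 and 5 to node 0. A real system would also have to weigh the cost of each migration against the expected saving, which this sketch ignores.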

Author information

Authors and Affiliations

Centre for Novel Computing, School of Computer Science, The University of Manchester, Manchester, M13 9PL
Paul Slavin & Len Freeman

Editor information

Emilio Luque, Tomàs Margalef, Domingo Benítez

About this paper

Cite this paper

Slavin, P., Freeman, L. (2008). Integrating Dynamic Memory Placement with Adaptive Load-Balancing for Parallel Codes on NUMA Multiprocessors. In: Luque, E., Margalef, T., Benítez, D. (eds) Euro-Par 2008 – Parallel Processing. Euro-Par 2008. Lecture Notes in Computer Science, vol 5168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85451-7_30

DOI: https://doi.org/10.1007/978-3-540-85451-7_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85450-0
Online ISBN: 978-3-540-85451-7
eBook Packages: Computer Science, Computer Science (R0)