DOI: 10.1145/3092255.3092273
research-article

RTHMS: a tool for data placement on hybrid memory system

Published: 18 June 2017

Abstract

Traditional scientific and emerging data analytics applications require fast, power-efficient, large, and persistent memories. Combining all these characteristics within a single memory technology is expensive, and hence future supercomputers will feature different memory technologies side by side. However, programming hybrid-memory systems and identifying the best object-to-memory mapping are complex tasks. We envision that programmers will probably resort to default configurations that require only minimal intervention on the application code or system settings. In this work, we argue that intelligent, fine-grained data placement can achieve higher performance than default setups.
We present an algorithm for data placement on hybrid-memory systems. Our algorithm is based on a set of single-object allocation rules and global data placement decisions. We also present RTHMS, a tool that implements our algorithm and provides recommendations about the object-to-memory mapping. Our experiments on a hybrid memory system, an Intel Knights Landing processor with DRAM and HBM, show that RTHMS is able to achieve higher performance than the default configuration. We believe that RTHMS will be a valuable tool for programmers working on complex hybrid-memory systems.
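
The abstract does not spell out the allocation rules, so the following is only a minimal sketch of the general idea of combining a per-object score with a global capacity constraint. It is not the RTHMS algorithm. The `MemObject` type, the `place_objects` function, the access-density metric, and all example numbers are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemObject:
    name: str
    size_bytes: int
    accesses: int  # hypothetical profiled access count


def place_objects(objects, hbm_capacity_bytes):
    """Greedy illustration: rank objects by accesses per byte (an assumed
    per-object metric), then fill the fast memory until its capacity is
    exhausted (the global decision). Everything else goes to DRAM."""
    placement = {}
    free = hbm_capacity_bytes
    ranked = sorted(
        objects,
        key=lambda o: o.accesses / max(o.size_bytes, 1),
        reverse=True,
    )
    for obj in ranked:
        if obj.size_bytes <= free:
            placement[obj.name] = "HBM"
            free -= obj.size_bytes
        else:
            placement[obj.name] = "DRAM"
    return placement


GIB = 1 << 30
example = [
    MemObject("grid", 8 * GIB, 10_000),
    MemObject("index", 1 * GIB, 50_000),
    MemObject("log", 12 * GIB, 100),
]
result = place_objects(example, hbm_capacity_bytes=16 * GIB)
print(result)
```

In this made-up example, the densely accessed `index` and `grid` objects fit within the assumed 16 GiB of fast memory, while the rarely accessed `log` object is left in DRAM. A real tool would likely need richer per-object metrics than a single access count.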

        Published In

        ISMM 2017: Proceedings of the 2017 ACM SIGPLAN International Symposium on Memory Management
        June 2017
        127 pages
        ISBN:9781450350440
        DOI:10.1145/3092255
        © 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. data placement
        2. heterogeneous memory systems
        3. performance metrics

        Qualifiers

        • Research-article

        Funding Sources

        • the DOE Office of Science Advanced Scientific Computing Research through the ARGO project
        • the DOE Office of Science Advanced Scientific Computing Research through the CENATE project
        • the European Commission through the SAGE project

        Conference

        ISMM '17
        Acceptance Rates

        Overall Acceptance Rate 72 of 156 submissions, 46%

        Cited By

        • (2025) The ECP SICM project. International Journal of High Performance Computing Applications 39:1 (193-207). DOI: 10.1177/10943420241288243. Online publication date: 1-Jan-2025
        • (2023) Flexible and Effective Object Tiering for Heterogeneous Memory Systems. Proceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management (163-175). DOI: 10.1145/3591195.3595277. Online publication date: 6-Jun-2023
        • (2022) Characterizing and Optimizing Hybrid DRAM-PM Main Memory System with Application Awareness. 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE) (879-884). DOI: 10.23919/DATE54114.2022.9774718. Online publication date: 14-Mar-2022
        • (2022) Online Application Guidance for Heterogeneous Memory Systems. ACM Transactions on Architecture and Code Optimization 19:3 (1-27). DOI: 10.1145/3533855. Online publication date: 6-Jul-2022
        • (2021) NumaPerf. Proceedings of the 35th ACM International Conference on Supercomputing (52-62). DOI: 10.1145/3447818.3460361. Online publication date: 3-Jun-2021
        • (2021) Optimizing large-scale plasma simulations on persistent memory-based heterogeneous memory with effective data placement across memory hierarchy. Proceedings of the 35th ACM International Conference on Supercomputing (203-214). DOI: 10.1145/3447818.3460356. Online publication date: 3-Jun-2021
        • (2021) Resource abstraction and data placement for distributed hybrid memory pool. Frontiers of Computer Science: Selected Publications from Chinese Universities 15:3. DOI: 10.1007/s11704-020-9448-7. Online publication date: 1-Jun-2021
        • (2020) Generalized data placement strategies for racetrack memories. Proceedings of the 23rd Conference on Design, Automation and Test in Europe (1502-1507). DOI: 10.5555/3408352.3408693. Online publication date: 9-Mar-2020
        • (2020) Generalized Data Placement Strategies for Racetrack Memories. 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) (1502-1507). DOI: 10.23919/DATE48585.2020.9116245. Online publication date: Mar-2020
        • (2020) Performance Potential of Mixed Data Management Modes for Heterogeneous Memory Systems. 2020 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC) (10-16). DOI: 10.1109/MCHPC51950.2020.00007. Online publication date: Nov-2020
