
A Full-System Perspective on UPMEM Performance

Published: 23 October 2023
DOI: 10.1145/3609308.3625266

ABSTRACT

Recently, UPMEM has introduced the first commercially available processing in memory (PIM) platform. Its key feature is DRAM memory chips with built-in RISC CPUs for in-memory data processing. Naturally, this has sparked interest in the research community, whose work was previously limited to PIM simulators and custom FPGA prototypes. One result is the PrIM benchmark suite, which combines an in-depth analysis of PIM performance with benchmarks that measure the speedup of PIM over processing on conventional CPUs and GPUs [10]. However, the current generation of UPMEM PIM faces limitations such as memory interleaving, and as such does not provide true in-memory computing. Applications must store data in DRAM and transfer it to/from the UPMEM modules for processing; from this perspective, the modules behave just like computational offloading engines. This paper examines the ramifications of treating them as such in comparative performance benchmarks. By extending the PrIM suite to address the challenges that computational offloading benchmarks face, we show that such a full-system perspective can drastically alter offloading recommendations, with 9 of 11 previously UPMEM-friendly benchmarks now performing best on a conventional server CPU.
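
To make the offloading flow behind these measurements concrete, the sketch below shows the host-side steps that a full-system measurement has to account for when using the UPMEM SDK host API [21]: allocating DPUs, loading a kernel binary, copying inputs into the module, launching the kernel, and copying results back. This is a minimal sketch under assumptions, not code from the paper: the binary path ./dpu_kernel and the symbols input_buffer and result are hypothetical placeholders for a DPU program that declares them, and only a single DPU is allocated to keep the example small.

/* Minimal host-side sketch of the UPMEM offloading flow (UPMEM SDK [21]).
 * Assumptions: a DPU binary "./dpu_kernel" exists and declares the symbols
 * input_buffer (an MRAM buffer) and result (a 64-bit host-visible variable);
 * these names are placeholders, not taken from the paper. */
#include <dpu.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define INPUT_SIZE (1 << 20)   /* 1 MiB of input, chosen arbitrarily */

static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void) {
    static uint8_t host_input[INPUT_SIZE]; /* data starts in ordinary DRAM */
    uint64_t host_result = 0;
    struct dpu_set_t set;

    /* One DPU keeps the sketch small; PrIM-style runs spread work over thousands. */
    DPU_ASSERT(dpu_alloc(1, NULL, &set));
    DPU_ASSERT(dpu_load(set, "./dpu_kernel", NULL));

    double t0 = now_sec();
    /* Host -> DPU copy: visible only to a full-system measurement. */
    DPU_ASSERT(dpu_copy_to(set, "input_buffer", 0, host_input, sizeof(host_input)));

    double t1 = now_sec();
    /* Kernel execution: the part a kernel-only benchmark reports. */
    DPU_ASSERT(dpu_launch(set, DPU_SYNCHRONOUS));

    double t2 = now_sec();
    /* DPU -> host copy of the result. */
    DPU_ASSERT(dpu_copy_from(set, "result", 0, &host_result, sizeof(host_result)));
    double t3 = now_sec();

    printf("copy-in %.2f ms, kernel %.2f ms, copy-out %.2f ms, total %.2f ms\n",
           (t1 - t0) * 1e3, (t2 - t1) * 1e3, (t3 - t2) * 1e3, (t3 - t0) * 1e3);

    DPU_ASSERT(dpu_free(set));
    return 0;
}

The split timing mirrors the distinction the paper draws: only the time spent inside dpu_launch corresponds to what a kernel-only comparison would report, while a CPU baseline that already holds the data in DRAM pays none of the copy-in/copy-out costs, so omitting them flatters the accelerator.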

References

  1. D. H. Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter, L. Dagum, R. A. Fatoohi, P. O. Frederickson, T. A. Lasinski, R. S. Schreiber, H. D. Simon, V. Venkatakrishnan, and S. K. Weeratunga. 1991. The NAS Parallel Benchmarks---summary and Preliminary Results. In Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Albuquerque, New Mexico, USA) (Supercomputing '91). Association for Computing Machinery, New York, NY, USA, 158--165.
  2. Alexander Baumstark, Muhammad Attahir Jibril, and Kai-Uwe Sattler. 2023. Accelerating Large Table Scan using Processing-In-Memory Technology. In BTW 2023. Gesellschaft für Informatik e.V., Bonn, 797--814.
  3. Stefano Corda, Madhurya Kumaraswamy, Ahsan Javed Awan, Roel Jordans, Akash Kumar, and Henk Corporaal. 2021. NMPO: Near-Memory Computing Profiling and Offloading. In 2021 24th Euromicro Conference on Digital System Design (DSD). 259--267.
  4. Stefano Corda, Gagandeep Singh, Ahsan Jawed Awan, Roel Jordans, and Henk Corporaal. 2019. Platform Independent Software Analysis for Near Memory Computing. In 2019 22nd Euromicro Conference on Digital System Design (DSD). 606--609.
  5. Andrew Davison. 1995. Twelve Ways to Fool the Masses When Giving Performance Results on Parallel Computers. Supercomputing Review (August 1995), 54--55.
  6. Fabrice Devaux. 2019. The true Processing In Memory accelerator. In 2019 IEEE Hot Chips 31 Symposium (HCS). 1--24.
  7. François Duhem, Fabrice Muller, and Philippe Lorenzini. 2011. FaRM: Fast Reconfiguration Manager for Reducing Reconfiguration Time Overhead on FPGA. In Reconfigurable Computing: Architectures, Tools and Applications, Andreas Koch, Ram Krishnamurthy, John McAllister, Roger Woods, and Tarek El-Ghazawi (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 253--260.
  8. Scott Grauer-Gray, Lifan Xu, Robert Searles, Sudhee Ayalasomayajula, and John Cavazos. 2012. Auto-tuning a high-level language targeted to GPU codes. In 2012 Innovative Parallel Computing (InPar). 1--10.
  9. Khronos OpenCL Working Group. 2023. The OpenCL Specification, Version 3.0.14. https://registry.khronos.org/OpenCL/specs/3.0-unified/pdf/OpenCL_API.pdf
  10. Juan Gómez-Luna, Izzat El Hajj, Ivan Fernandez, Christina Giannoula, Geraldo F. Oliveira, and Onur Mutlu. 2022. Benchmarking a New Paradigm: Experimental Analysis and Characterization of a Real Processing-in-Memory System. IEEE Access 10 (2022), 52565--52608.
  11. Torsten Hoefler and Roberto Belli. 2015. Scientific Benchmarking of Parallel Computing Systems: Twelve Ways to Tell the Masses When Reporting Performance Results. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (Austin, Texas) (SC '15). Association for Computing Machinery, New York, NY, USA, Article 73, 12 pages.
  12. Cheol-Ho Hong, Ivor Spence, and Dimitrios S. Nikolopoulos. 2017. GPU Virtualization and Scheduling Methods: A Comprehensive Survey. ACM Comput. Surv. 50, 3, Article 35 (June 2017), 37 pages.
  13. Nina Ihde, Paula Marten, Ahmed Eleliemy, Gabrielle Poerwawinata, Pedro Silva, Ilin Tolovski, Florina M. Ciorba, and Tilmann Rabl. 2022. A Survey of Big Data, High Performance Computing, and Machine Learning Benchmarks. In Performance Evaluation and Benchmarking, Raghunath Nambiar and Meikel Poess (Eds.). Springer International Publishing, Cham, 98--118.
  14. Donghun Lee, Andrew Chang, Minseon Ahn, Jongmin Gim, Jungmin Kim, Jaemin Jung, Kang-Woo Choi, Vincent Pham, Oliver Rebholz, Krishna T. Malladi, and Yang-Seok Ki. 2020. Optimizing Data Movement with Near-Memory Acceleration of In-memory DBMS. In Proceedings of the 23rd International Conference on Extending Database Technology (EDBT 2020), Copenhagen, Denmark, March 30 - April 02, 2020, Angela Bonifati, Yongluan Zhou, Marcos Antonio Vaz Salles, Alexander Böhm, Dan Olteanu, George H. L. Fletcher, Arijit Khan, and Bin Yang (Eds.). OpenProceedings.org, 371--374.
  15. Victor W. Lee, Changkyu Kim, Jatin Chhugani, Michael Deisher, Daehyun Kim, Anthony D. Nguyen, Nadathur Satish, Mikhail Smelyanskiy, Srinivas Chennupaty, Per Hammarlund, Ronak Singhal, and Pradeep Dubey. 2010. Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU. SIGARCH Comput. Archit. News 38, 3 (June 2010), 451--460.
  16. Kyprianos Papadimitriou, Apostolos Dollas, and Scott Hauck. 2011. Performance of Partial Reconfiguration in FPGA Systems: A Survey and a Cost Model. ACM Trans. Reconfigurable Technol. Syst. 4, 4, Article 36 (December 2011), 24 pages.
  17. Albert Reuther, Peter Michaleas, Michael Jones, Vijay Gadepally, Siddharth Samsi, and Jeremy Kepner. 2019. Survey and Benchmarking of Machine Learning Accelerators. In 2019 IEEE High Performance Extreme Computing Conference (HPEC). 1--9.
  18. Robert Schmid, Max Plauth, Lukas Wenzel, Felix Eberhardt, and Andreas Polze. 2020. Accessible Near-Storage Computing with FPGAs. In Proceedings of the Fifteenth European Conference on Computer Systems (Heraklion, Greece) (EuroSys '20). Association for Computing Machinery, New York, NY, USA, Article 28, 12 pages.
  19. Janet Tseng, Ren Wang, James Tsai, Yipeng Wang, and Tsung-Yuan Charlie Tai. 2017. Accelerating Open VSwitch with Integrated GPU. In Proceedings of the Workshop on Kernel-Bypass Networks (Los Angeles, CA, USA) (KBNets '17). Association for Computing Machinery, New York, NY, USA, 7--12.
  20. Yash Ukidave, Fanny Nina Paravecino, Leiming Yu, Charu Kalra, Amir Momeni, Zhongliang Chen, Nick Materise, Brett Daley, Perhaad Mistry, and David Kaeli. 2015. NUPAR: A Benchmark Suite for Modern GPU Architectures. In Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering (Austin, Texas, USA) (ICPE '15). Association for Computing Machinery, New York, NY, USA, 253--264.
  21. UPMEM. 2023. UPMEM SDK, version 2023.1.0. https://sdk.upmem.com/

Published in

DIMES '23: Proceedings of the 1st Workshop on Disruptive Memory Systems
October 2023, 64 pages
ISBN: 9798400703003
DOI: 10.1145/3609308
Copyright © 2023 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
