Research Article (Public Access)
DOI: 10.1145/3173162.3173171

In-Memory Data Parallel Processor

Published: 19 March 2018

ABSTRACT

Recent developments in Non-Volatile Memories (NVMs) have opened up a new horizon for in-memory computing. Despite the significant performance gain offered by computational NVMs, previous works have relied on manual mapping of specialized kernels to the memory arrays, making it infeasible to execute more general workloads. We combat this problem by proposing a programmable in-memory processor architecture and data-parallel programming framework. The efficiency of the proposed in-memory processor comes from two sources: massive parallelism and reduction in data movement. A compact instruction set provides generalized computation capabilities for the memory array. The proposed programming framework seeks to leverage the underlying parallelism in the hardware by merging the concepts of data-flow and vector processing. To facilitate in-memory programming, we develop a compilation framework that takes a TensorFlow input and generates code for our in-memory processor. Our results demonstrate 7.5x speedup over a multi-core CPU server for a set of applications from Parsec and 763x speedup over a server-class GPU for a set of Rodinia benchmarks.
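To make the programming model concrete, the sketch below (not taken from the paper) shows the kind of data-parallel TensorFlow program the compilation framework described above could accept as input; the `saxpy` kernel and all values are hypothetical illustrations. Every output element is independent of the others, which is the property that lets such a dataflow graph be mapped across the memory arrays in parallel.

    # Illustrative sketch only: a data-parallel TensorFlow kernel of the kind
    # the paper's compilation framework could take as input. The saxpy kernel
    # and the constants below are hypothetical examples, not from the paper.
    import tensorflow as tf

    @tf.function
    def saxpy(a, x, y):
        # Element-wise multiply-add: each output element depends only on the
        # corresponding inputs, so the work can be spread across many memory
        # arrays in parallel.
        return a * x + y

    x = tf.constant([1.0, 2.0, 3.0, 4.0])
    y = tf.constant([10.0, 20.0, 30.0, 40.0])
    print(saxpy(tf.constant(2.0), x, y))  # tf.Tensor([12. 24. 36. 44.], ...)

A graph like this exposes both operation-level dependencies (data-flow) and element-level parallelism (vector processing), the two forms of parallelism the abstract says the framework merges.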


       • Published in

         ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, March 2018, 827 pages. ISBN: 9781450349116. DOI: 10.1145/3173162.

         ACM SIGPLAN Notices, Volume 53, Issue 2 (ASPLOS '18), February 2018, 809 pages. ISSN: 0362-1340. EISSN: 1558-1160. DOI: 10.1145/3296957.

        Copyright © 2018 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        ASPLOS '18 paper acceptance rate: 56 of 319 submissions (18%). Overall acceptance rate: 535 of 2,713 submissions (20%).
