ABSTRACT
Recent developments in Non-Volatile Memories (NVMs) have opened up a new horizon for in-memory computing. Despite the significant performance gains offered by computational NVMs, previous works have relied on manual mapping of specialized kernels to the memory arrays, making it infeasible to execute more general workloads. We combat this problem by proposing a programmable in-memory processor architecture and a data-parallel programming framework. The efficiency of the proposed in-memory processor comes from two sources: massive parallelism and a reduction in data movement. A compact instruction set provides generalized computation capabilities for the memory array. The proposed programming framework seeks to leverage the underlying parallelism in the hardware by merging the concepts of data-flow and vector processing. To facilitate in-memory programming, we develop a compilation framework that takes a TensorFlow input and generates code for our in-memory processor. Our results demonstrate a 7.5x speedup over a multi-core CPU server on a set of applications from the Parsec suite and a 763x speedup over a server-class GPU on a set of Rodinia benchmarks.
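To make the compiler's input concrete, the sketch below builds a small TensorFlow dataflow graph of element-wise vector operations; a static graph of this kind is the sort of data-parallel workload a compiler such as the one described here could lower to array-level in-memory instructions. The kernel (a simple axpy-style computation) and all identifiers are hypothetical illustrations, not taken from the paper's toolchain, and the example uses the TensorFlow 1.x graph API that was current at the time of ASPLOS '18.

```python
# Minimal sketch of a TensorFlow dataflow graph (hypothetical kernel),
# written against the TensorFlow 1.x graph API.
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    # Element-wise, data-parallel work maps naturally onto memory arrays:
    # each sub-array can own a slice of the input vectors.
    a = tf.placeholder(tf.float32, shape=[1 << 20], name="a")
    b = tf.placeholder(tf.float32, shape=[1 << 20], name="b")
    # axpy-style kernel: y = 2*a + b, expressed as a pure dataflow of vector ops.
    y = tf.add(tf.multiply(a, 2.0), b, name="y")

# The serialized graph (nodes Mul and Add plus their edges) is what an
# in-memory compiler would walk to emit instructions for the memory arrays.
graph_def = graph.as_graph_def()
```

The design point the abstract emphasizes is visible even in this toy graph: the operations are independent across vector lanes, so they can be distributed across memory arrays with no inter-array data movement.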
Index Terms
- In-Memory Data Parallel Processor