A programmable shared-memory system for an array of processing-in-memory devices
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
- Sogang Univ., Seoul (Republic of Korea)
Processing in memory (PIM), the concept of integrating processing directly with memory, has been attracting a lot of attention, since PIM can help overcome the throughput limitation caused by data movement between CPU and memory. The challenge, however, is that it requires programmers to have a deep understanding of the PIM architecture in order to maximize benefits such as data locality and parallel thread execution on multiple PIM devices. In this study, we present AnalyzeThat, a programmable shared-memory system for parallel data processing with PIM devices. Central to AnalyzeThat is a rich PIM-aware data structure (PADS), an encapsulation that ties together the data, the analysis tasks, and the runtime needed to interface with the PIM device array. The PADS abstraction provides (i) a sophisticated key-value data container that allows programmers to easily store data on multiple PIMs, (ii) a suite of parallel operations with which users can easily implement data analysis applications, and (iii) a runtime, hidden from programmers, which provides the mechanisms needed to overlay both the data and the tasks on the PIM device array in an intelligent fashion, based on PIM-specific information collected from the hardware. We have developed a PIM emulation framework called AnalyzeThat. Our experimental evaluation with representative data analytics applications suggests that the proposed system can significantly reduce the PIM programming effort without losing its technology benefits.
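As a rough illustration of the three PADS components named in the abstract (a key-value container spread over several devices, a parallel operation, and a hidden placement runtime), the following is a minimal sketch and not the authors' implementation. The class name `ToyPADS`, its methods, the hash-based placement policy, and the use of Python threads to stand in for PIM devices are all assumptions made for this example; the abstract does not specify the real API or placement strategy.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


class ToyPADS:
    """Hypothetical stand-in for a PIM-aware data structure (not the real API)."""

    def __init__(self, num_devices):
        # Each dict stands in for the local memory of one emulated PIM device.
        self.partitions = [{} for _ in range(num_devices)]

    def _device_for(self, key):
        # Assumed placement policy: plain hash partitioning. The abstract says
        # the real runtime uses PIM-specific hardware information instead.
        return hash(key) % len(self.partitions)

    def put(self, key, value):
        self.partitions[self._device_for(key)][key] = value

    def get(self, key):
        return self.partitions[self._device_for(key)][key]

    def map_reduce(self, map_fn, reduce_fn, initial):
        # `initial` must be an identity element for `reduce_fn`, because it is
        # used once per partition and once more when combining partial results.
        def run_on_device(partition):
            mapped = (map_fn(k, v) for k, v in partition.items())
            return reduce(reduce_fn, mapped, initial)

        # One worker per partition, so each task reads only its own device's data.
        with ThreadPoolExecutor(max_workers=len(self.partitions)) as pool:
            partials = list(pool.map(run_on_device, self.partitions))

        # Combine the per-device partial results on the host side.
        return reduce(reduce_fn, partials, initial)


store = ToyPADS(num_devices=4)
for i in range(100):
    store.put(i, i)

# Sum of squares of all stored values, computed partition by partition.
total = store.map_reduce(lambda k, v: v * v, lambda a, b: a + b, 0)
```

The caller never chooses a device: placement happens inside `put`, and the parallel operation is scheduled per partition, which is the kind of separation the abstract attributes to the hidden runtime.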
- Research Organization:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- Grant/Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1468266
- Journal Information:
- Cluster Computing, Vol. 22; ISSN 1386-7857
- Publisher:
- Springer
- Country of Publication:
- United States
- Language:
- English