ABSTRACT
Processing-in-Memory (PIM) is re-emerging as a promising solution to the memory-wall problem as computing systems move into the big data era. Researchers have continually proposed PIM architectures that combine novel memory devices or 3D integration technology, but a universal task scheduling method for these new heterogeneous platforms is still lacking. In this paper, we propose a formalized model to quantify the performance and energy of a PIM+CPU heterogeneous parallel system. In addition, we are the first to build a task partitioning and mapping framework that exploits different PIM engines. In this framework, an application is divided into subtasks that are mapped onto appropriate execution units by the proposed PIM-oriented Earliest-Finish-Time (PEFT) algorithm, maximizing the performance gains brought by PIM. Experimental evaluations show that our PIM-aware framework significantly improves system performance compared to conventional processor architectures.
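The core scheduling idea described above, mapping each subtask onto whichever execution unit finishes it earliest, can be sketched as a small earliest-finish-time (EFT) list scheduler. This is only an illustrative sketch of the general EFT technique, not the paper's PEFT algorithm: the task graph, the per-unit cost numbers, and the uniform data-transfer cost are all made-up assumptions.

```python
# Illustrative EFT list scheduling over a CPU+PIM platform.
# All tasks, costs, and the COMM constant below are assumed values,
# chosen so that PIM is cheaper for the memory-bound tasks t1 and t3.
exec_cost = {
    "t0": {"CPU": 4, "PIM": 9},
    "t1": {"CPU": 8, "PIM": 3},
    "t2": {"CPU": 5, "PIM": 7},
    "t3": {"CPU": 9, "PIM": 4},
}
deps = {"t0": [], "t1": ["t0"], "t2": ["t0"], "t3": ["t1", "t2"]}
COMM = 2  # assumed transfer cost when parent and child run on different units


def topo_order(deps):
    """Return tasks in dependency (topological) order via DFS."""
    seen, order = set(), []

    def visit(t):
        if t in seen:
            return
        seen.add(t)
        for p in deps[t]:
            visit(p)
        order.append(t)

    for t in deps:
        visit(t)
    return order


def eft_schedule(exec_cost, deps, comm=COMM):
    """Greedily place each task on the unit that minimizes its finish time."""
    unit_free = {"CPU": 0, "PIM": 0}  # time at which each unit becomes idle
    placed = {}                       # task -> (unit, start, finish)
    for t in topo_order(deps):
        best = None
        for u in unit_free:
            # Task is ready once every parent has finished, plus a
            # transfer delay if the parent ran on a different unit.
            ready = 0
            for p in deps[t]:
                pu, _, pf = placed[p]
                ready = max(ready, pf + (comm if pu != u else 0))
            start = max(ready, unit_free[u])
            finish = start + exec_cost[t][u]
            if best is None or finish < best[2]:
                best = (u, start, finish)
        placed[t] = best
        unit_free[best[0]] = best[2]
    return placed


schedule = eft_schedule(exec_cost, deps)
```

With these assumed costs the scheduler offloads the memory-bound tasks t1 and t3 to the PIM unit while t0 and t2 stay on the CPU, so both units overlap and the makespan beats a CPU-only schedule. A full PEFT-style scheduler would additionally rank tasks by upward cost and account for per-edge data volumes rather than a single transfer constant.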
Index Terms
- Selective off-loading to Memory: Task Partitioning and Mapping for PIM-enabled Heterogeneous Systems