ABSTRACT
Migrating computation to memory was proposed long ago as a way to overcome the memory bandwidth and latency bottleneck and to increase computational parallelism. While the concept has been applied in several research projects, only recently have the technological hurdles been overcome, and products are now arriving on the market. Although in most cases we will need to develop new algorithms and port applications to new programming models to fully exploit the potential of these products, we will still want to execute existing applications efficiently. In this work, we therefore focus on analyzing the in-memory computation characteristics of existing applications in order to evaluate how well they could move to "Memoryland".
We present a tool that analyzes the locality of the memory accesses of the different routines in an application. Running this tool on a range of applications shows that while some applications fit a small-granularity architecture (a small memory-to-computation ratio), others contain routines that require large amounts of data. We therefore believe that hierarchical in-memory processing architectures are a good fit for the demands of these diverse applications. In addition, the results show that for most applications the analysis can be limited to the routines that issue the most memory accesses.
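The kind of per-routine locality analysis described above can be sketched as follows. This is a minimal illustration in Python over a synthetic memory trace; the routine names, the `locality_profile` helper, and the 64-byte line size are assumptions for the example, not the paper's actual tool, which would instrument real executions.

```python
from collections import defaultdict

LINE_SIZE = 64  # assumed cache-line granularity in bytes

def locality_profile(trace):
    """Summarize per-routine memory footprint from (routine, address) events.

    Returns {routine: (total_accesses, unique_cache_lines)}.
    A low unique-lines-to-accesses ratio means high reuse, i.e. a small
    working set that could sit next to a small in-memory compute unit;
    a ratio near 1 means streaming behavior over a large data set.
    """
    accesses = defaultdict(int)
    lines = defaultdict(set)
    for routine, addr in trace:
        accesses[routine] += 1
        lines[routine].add(addr // LINE_SIZE)
    return {r: (accesses[r], len(lines[r])) for r in accesses}

# Synthetic trace: 'stencil' repeatedly reuses a tiny buffer,
# 'stream' touches a new cache line on every access.
trace  = [("stencil", 4096 + (i % 8) * 8) for i in range(100)]
trace += [("stream", (1 << 20) + i * 64) for i in range(100)]

profile = locality_profile(trace)
print(profile["stencil"])  # many accesses, few unique lines -> small footprint
print(profile["stream"])   # one unique line per access -> large footprint
```

Ranking routines by `total_accesses` before inspecting their footprints mirrors the paper's observation that the analysis can usually be limited to the routines issuing the most memory accesses.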
Moving to memoryland: in-memory computation for existing applications