Elsevier

Journal of Systems Architecture

Volume 71, November 2016, Pages 12-22
Energy efficient task allocation for hybrid main memory architecture

https://doi.org/10.1016/j.sysarc.2016.06.001

Abstract

Compared with conventional dynamic random access memory (DRAM), emerging non-volatile memory (NVM) technologies provide better density and energy efficiency. However, current NVM devices typically suffer from high write power, long write latency and low write endurance. In this paper, we study the task allocation problem for a hybrid main memory architecture with both DRAM and phase change random access memory (PRAM), in order to balance system performance against the energy consumption of the memory subsystem by assigning each individual task to a suitable memory device. For an embedded system with a static set of periodic tasks, we design an integer linear programming (ILP) based offline adaptive space allocation (offline-ASA) algorithm to obtain the optimal task allocation. Furthermore, we propose an online adaptive space allocation (online-ASA) algorithm for dynamic task sets where the arrivals of tasks are not known in advance. Experimental results show that our proposed schemes achieve 27.01% energy saving on average, with an additional performance cost of 13.6%.

Introduction

The memory subsystem has a significant impact on the performance and energy efficiency of contemporary computer systems. In addition to the power consumed by read/write operations, traditional DRAM constantly requires background and refresh power in order to retain data integrity. Research shows that DRAM-based main memory accounts for 40% of the total energy consumption of modern computer systems [1], [2]. For energy-constrained embedded systems with low-power embedded processors, memory management becomes a key consideration in energy-efficient system design.

Emerging non-volatile memory (NVM) technologies such as phase change random access memory (PRAM) have attracted extensive attention from both industry and the research community. PRAM can be used as main memory since its performance metrics are similar to those of DRAM. Compared to DRAM, NVM typically has better power efficiency due to its low background power and the absence of refresh energy consumption. However, current NVM technologies such as PRAM usually suffer from low write endurance, as well as high write latency and power consumption. A detailed comparison between DRAM and PRAM is shown in Table 1, as presented in [3].

Given the characteristics of DRAM and PRAM, many researchers have put forward the idea of designing a main memory architecture with both DRAM and PRAM. Some studies suggest using a small DRAM as an upper-level buffer for a main memory consisting only of PRAM [4]. Other work places PRAM and DRAM in parallel so that together they constitute a unified main address space [5], [6]. In this work, we focus on the latter approach, where the entire main memory space with both DRAM and PRAM is managed by the operating system.

Most existing work on hybrid DRAM-PRAM memory focuses on architecture-level design methodologies. For application-specific embedded systems, however, it is of paramount importance to study the characteristics of the applications in order to make optimal system-level design choices. In particular, given the different memory access patterns of various applications, it is possible to determine an optimal ratio between DRAM and PRAM in the hybrid main memory architecture, as well as the corresponding task allocation schemes for high-performance, energy-efficient design. For a given total main memory size, increasing the use of PRAM over DRAM saves static energy, but it may lead to excessive writes to PRAM, which shorten PRAM’s lifetime and degrade system performance. Although there have been several studies on reducing DRAM static energy consumption with task allocation and scheduling on digital signal processing (DSP) systems [7], the proportion of DRAM and PRAM in hybrid architectures has not been considered. In this paper, we study the optimal ratio between DRAM and PRAM and the task allocation problem in a hybrid main memory architecture for embedded systems. We explore the trade-offs and propose task allocation schemes for hybrid main memory that maximize energy saving while prolonging the lifetime of PRAM and minimizing the performance overhead. If the original task set is schedulable on the traditional DRAM-based system, our proposed task allocation schemes guarantee the schedulability of the task set on the hybrid main memory architecture. The main contributions of this paper are as follows.

  • (1) For a given set of periodic tasks, we propose an Integer Linear Programming (ILP) based offline allocation algorithm to explore the optimal ratio between DRAM and PRAM, which reduces the write operations on PRAM and balances energy consumption against performance overhead.

  • (2) In order to enable fast design space exploration, we design a heuristic algorithm, the offline Adaptive Space Allocation algorithm (offline-ASA), for task allocation on the hybrid memory architecture, which achieves near-optimal results in polynomial time.

  • (3) For a task set where the arrival times of individual tasks are unknown, we propose an online Adaptive Space Allocation algorithm (online-ASA) to balance the sizes of DRAM and PRAM while minimizing energy consumption and providing wear leveling for PRAM.

  • (4) We have performed experiments to evaluate the proposed algorithms. Results show that the proposed ILP and offline-ASA achieve 42.8% and 35.1% energy saving, at a cost of 35.4% and 17.1% performance degradation, respectively. The online-ASA leads to 27.01% energy saving with a 13.6% performance overhead.
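To make the allocation trade-off concrete, the following is a minimal sketch of choosing DRAM or PRAM for each task so that energy is minimized under a DRAM capacity limit and a slowdown budget. It is not the paper's ILP or offline-ASA: it uses plain exhaustive search, and the `Task` fields, the function `best_allocation`, the `max_slowdown` parameter and all numbers are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Task:
    """Hypothetical per-task parameters (arbitrary units, not from the paper)."""
    name: str
    size_kb: int        # memory footprint of the task
    dram_energy: float  # energy per period when placed in DRAM
    pram_energy: float  # energy per period when placed in PRAM
    dram_time: float    # execution time per period in DRAM
    pram_time: float    # execution time per period in PRAM (writes are slower)


def best_allocation(tasks, dram_capacity_kb, max_slowdown):
    """Try every DRAM/PRAM assignment and return (energy, assignment) for the
    lowest-energy one that fits in DRAM and stays within the slowdown budget.
    Returns None if no assignment is feasible."""
    baseline_time = sum(t.dram_time for t in tasks)  # all-DRAM reference time
    time_limit = baseline_time * (1 + max_slowdown)
    best = None
    for choice in product(("DRAM", "PRAM"), repeat=len(tasks)):
        pairs = list(zip(tasks, choice))
        dram_used = sum(t.size_kb for t, m in pairs if m == "DRAM")
        if dram_used > dram_capacity_kb:
            continue  # DRAM-resident tasks do not fit
        total_time = sum(t.dram_time if m == "DRAM" else t.pram_time for t, m in pairs)
        if total_time > time_limit:
            continue  # too much performance overhead
        energy = sum(t.dram_energy if m == "DRAM" else t.pram_energy for t, m in pairs)
        if best is None or energy < best[0]:
            best = (energy, {t.name: m for t, m in pairs})
    return best


# Illustrative task set: a write-heavy task is costly in PRAM, a read-heavy one is cheap.
tasks = [
    Task("read_heavy", 64, dram_energy=10, pram_energy=6, dram_time=4, pram_time=4.4),
    Task("write_heavy", 64, dram_energy=10, pram_energy=14, dram_time=4, pram_time=8),
    Task("mixed", 32, dram_energy=5, pram_energy=4, dram_time=2, pram_time=2.6),
]
result = best_allocation(tasks, dram_capacity_kb=96, max_slowdown=0.2)
print(result)
```

With these made-up numbers the search keeps the write-heavy task in DRAM and moves the other two to PRAM. Exhaustive search grows as 2^n in the number of tasks, which is the kind of cost that motivates a polynomial-time heuristic for larger task sets.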

The rest of this paper is organized as follows. Section 2 describes previous work and gives an overview of the proposed strategy. Section 3 presents the background of the hybrid main memory architecture, describes the energy and calculation model and provides the problem description. Section 4 gives the ILP formulations and the description of the offline-ASA algorithm. Section 5 discusses the online-ASA algorithm. Section 6 presents the experimental results together with a detailed analysis. Finally, Section 7 concludes the paper.

Section snippets

Related work

In the past decades, phase change RAM (PRAM), one of the emerging non-volatile memory technologies, has been comprehensively studied as a promising main memory candidate. Compared with traditional DRAM, PRAM has advantages such as non-volatility, higher density, higher throughput and lower leakage power consumption. However, PRAM suffers from limited write endurance and longer access latency. Therefore, the hybrid main memory architecture, which consists of both DRAM and

Problem analysis

In this section, we first introduce the background of the hybrid main memory architecture studied in this paper, followed by the performance and energy model for the architecture. Finally, we describe our target problem and the objectives in the design of the hybrid memory subsystem.

Offline task allocation algorithms

In this section, we first introduce the Integer Linear Programming formulations in Section 4.1, followed by the discussion of the offline adaptive space allocation (offline-ASA) algorithm in Section 4.2.
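As a rough illustration of what a 0-1 formulation of this kind of allocation problem can look like, consider the generic sketch below. It is an assumption for illustration only, not the formulation given in Section 4.1. All symbols are hypothetical: $x_i = 1$ places task $i$ in PRAM and $x_i = 0$ places it in DRAM; $E_i^{D}$, $E_i^{P}$ and $T_i^{D}$, $T_i^{P}$ are the task's energy and execution time in each memory; $s_i$ is its memory footprint; $C_D$ is the DRAM capacity; and $\delta$ is the tolerated slowdown relative to an all-DRAM system.

```latex
\begin{align}
\min_{x}\quad & \sum_{i=1}^{n} \bigl( x_i E_i^{P} + (1 - x_i) E_i^{D} \bigr) \\
\text{s.t.}\quad & \sum_{i=1}^{n} (1 - x_i)\, s_i \le C_D, \\
& \sum_{i=1}^{n} \bigl( x_i T_i^{P} + (1 - x_i) T_i^{D} \bigr) \le (1 + \delta) \sum_{i=1}^{n} T_i^{D}, \\
& x_i \in \{0, 1\}, \quad i = 1, \dots, n.
\end{align}
```

The objective and both constraints are linear in the binary variables $x_i$, so a sketch of this shape can be handed to a standard ILP solver. A real formulation would also need to capture PRAM write endurance and the schedulability of periodic tasks, which this sketch omits.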

Online task allocation algorithm

In this section, we first discuss our problem definition for online task allocation in Section 5.1, followed by the description of our dynamic adaptive space allocation algorithm in Section 5.2.

Experiments

In this section, to evaluate the effectiveness of the proposed algorithms, we conduct a series of experiments and present the experimental results with analysis. In these experiments, some key parameters are taken from our own hardware test platform. The DRAM chips (128MB) and the PRAM chips (128Mb) are both used with a unified address space in this platform. The hardware test platform is a prototype system designed as shown in Fig. 1. The DRAM chips (128MB) and the PRAM chips (128Mb) are both used and

Conclusions and future work

In this paper, we focus on the hybrid main memory architecture and study task allocation algorithms. ILP obtains the highest energy saving, but its higher performance loss is undesirable. The offline-ASA, a heuristic method for static task allocation, gives a relatively balanced solution. For a fixed set of tasks, the offline-ASA algorithm is effective and practical. Meanwhile, the online-ASA, a heuristic dynamic allocation strategy, is proposed in this

Acknowledgements

This research is sponsored by the State Key Program of National Natural Science Foundation of China No. 61533011, National High-tech R&D Program of China (863 Program) No. 2015AA011504, Shandong Provincial Natural Science Foundation under Grant No. ZR2015FM001, and the Fundamental Research Funds of Shandong University No. 2015JC030.

References (28)

  • J. Hu et al., Write activity reduction on non-volatile main memories for embedded chip multiprocessors, ACM Trans. Embed. Comput. Syst. (TECS) (2013)
  • P. Mangalagiri et al., A low-power phase change memory based hybrid cache architecture, Proceedings of the 18th ACM Great Lakes Symposium on VLSI (2008)
  • L.A. Barroso et al., The case for energy-proportional computing, IEEE Comput. (2007)
  • H.-S. Wong et al., Phase change memory, Proc. IEEE (2010)
  • M.K. Qureshi et al., Scalable high performance main memory system using phase-change memory technology, ACM SIGARCH Comput. Archit. News (2009)
  • W. Zhang et al., Exploring phase change memory and 3D die-stacking for power/thermal friendly, fast and durable memory architectures, Parallel Architectures and Compilation Techniques, 2009 (PACT’09) 18th International Conference on (2009)
  • L.E. Ramos et al., Page placement in hybrid memory systems, Proceedings of the International Conference on Supercomputing (2011)
  • T. Liu et al., Power-aware variable partitioning for DSPs with hybrid PRAM and DRAM main memory, Design Automation Conference (DAC), 2011 48th ACM/EDAC/IEEE (2011)
  • A.P. Ferreira et al., Increasing PCM main memory lifetime, Proceedings of the Conference on Design, Automation and Test in Europe (2010)
  • Z. Shao et al., PTL: PCM translation layer, VLSI (ISVLSI), 2012 IEEE Computer Society Annual Symposium on (2012)
  • J. Yun et al., Bloom filter-based dynamic wear leveling for phase-change RAM, Proceedings of the Conference on Design, Automation and Test in Europe (2012)
  • J. Hu et al., Write activity minimization for nonvolatile main memory via scheduling and recomputation, Comput. Aided Des. Integr. Circuits Syst. IEEE Trans. (2011)
  • D. Liu et al., A block-level flash memory management scheme for reducing write activities in PCM-based embedded systems, Proceedings of the Conference on Design, Automation and Test in Europe (2012)
  • G. Dhiman et al., PDRAM: a hybrid PRAM and DRAM main memory system, Design Automation Conference, 2009 (DAC’09) 46th ACM/IEEE (2009)
Xiaojun Cai received the M.E. and B.E. degrees from the School of Computer Science and Technology at Shandong University, in 2009 and 1997, respectively. He is now pursuing the Ph.D. degree. His main research interests include distributed and embedded systems, wireless sensor networks, emerging non-volatile memory and trust computing.

Lei Ju received his Ph.D. in 2010 from the School of Computing, National University of Singapore. In 2011, he started working as an associate professor in the School of Computer Science and Technology, Shandong University. His research interests focus on the design, analysis and optimization of real-time systems and embedded networks. He has authored a number of refereed publications (including DAC, RTAS, DATE, and CODES+ISSS), and served on the technical program committees of several international conferences.

Xin Li received his Ph.D. in 2008 from the Intelligence Engineering Lab, Institute of Software, Chinese Academy of Sciences. In 2008, he started working as an associate professor in the School of Computer Science and Technology, Shandong University. His research interests focus on energy efficient scheduling, emerging non-volatile memory and embedded systems. He has authored a number of refereed publications.

Zhiyong Zhang received the M.E. and B.E. degrees from the School of Computer Science and Technology at Shandong University, in 2013 and 2010, respectively. He is now pursuing the Ph.D. degree. His main research interests include real-time and embedded systems, emerging non-volatile memory, trust computing and mobile networks.

Zhiping Jia received the Master and Ph.D. degrees from the School of Computer Science and the School of Control Science, Shandong University, Jinan, China, in 1989 and 2007, respectively. From July 1989, he was with the Department of Computer Science and Technology at Shandong University. Since 2002, he has been a professor in the Department of Computer Science and Technology at Shandong University. He has published more than 70 research papers in refereed conferences and journals, and served as a program committee member for numerous international conferences. He received Shandong Province Award and Teaching Award.
