ABSTRACT
Write bandwidth is an inherent performance bottleneck for Phase Change Memory (PCM) for two reasons. First, PCM cells have a long programming time, and second, only a limited number of PCM cells can be programmed concurrently due to programming current and write circuit constraints.
For each PCM write, the data bits of the write request are typically mapped to multiple cell groups and processed in parallel. We observed that an unbalanced distribution of modified data bits among cell groups significantly increases PCM write time and hurts effective write bandwidth. To address this issue, we first uncover the cyclical and cluster patterns for modified data bits. Next, we propose double XOR mapping (D-XOR) to distribute modified data bits among cell groups in a balanced way. D-XOR can reduce PCM write service time by 45% on average, which increases PCM write throughput by 1.8x. As error correction (redundant bits) is critical for PCM, we also consider the impact of redundancy information in mapping data and error correction bits to cell groups. Our techniques lead to a 51% average reduction in write service time for a PCM main memory with ECC, which increases IPC by 12%.
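The effect of an unbalanced distribution can be illustrated with a minimal sketch. It assumes a simplified cost model in which write time is set by the cell group holding the most modified bits, and it uses a generic single-XOR placement function for comparison. The group count, write width, function names, and the XOR function itself are hypothetical choices for illustration; this is not the paper's D-XOR, whose details the abstract does not give.

```python
from collections import Counter

GROUPS = 8       # hypothetical number of cell groups (must be a power of two)
LINE_BITS = 64   # hypothetical number of data bits per write


def modulo_group(bit: int) -> int:
    """Baseline interleaving: bit i goes to cell group i mod GROUPS."""
    return bit % GROUPS


def xor_group(bit: int) -> int:
    """Illustrative XOR placement (an assumption, not D-XOR):
    XOR the column index with the row index so that a fixed-stride
    pattern lands in a different group on every row."""
    row = bit // GROUPS
    return (bit % GROUPS) ^ (row % GROUPS)


def max_group_load(modified_bits, group_of) -> int:
    """Modified bits in the busiest group; under the assumed cost model
    this group determines how long the whole write takes."""
    loads = Counter(group_of(b) for b in modified_bits)
    return max(loads.values(), default=0)


# A cyclical pattern: one modified bit every 8 bit positions.
cyclic = list(range(0, LINE_BITS, 8))

baseline_load = max_group_load(cyclic, modulo_group)
xor_load = max_group_load(cyclic, xor_group)
print("baseline busiest group:", baseline_load)
print("XOR busiest group:", xor_load)
```

With this cyclical pattern, the baseline sends every modified bit to the same group, while the XOR placement gives each group one modified bit. Because XOR with a fixed row index permutes the columns within a row, every group still receives the same total number of bits, so group capacity is unchanged.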
Published in ISCA '13.