Abstract
In the era of big data, computer systems must be enhanced to support the delivery of 2.5 quintillion bytes of data per day. Among the components of a computer system, main memory has a great impact on overall system performance. DRAM technology has been used to build main memories for the past four decades. However, the scalability of DRAM technology now faces serious challenges. To keep pace with the ever-increasing demand for larger main memory, several alternative technologies have been introduced. Phase change memory (PCM) is considered one such technology for replacing DRAM. PCM offers noteworthy properties such as low static power consumption, nonvolatility, and the capability of storing more than one bit per cell (multilevel cell, or MLC). However, the short lifetime and long access latency of PCM (specifically MLC PCM) call for feasible and efficient solutions.
In this article, based on the observation that applications access a significant number of read-friendly data blocks, we propose Express Read to prevent the MLC PCM read circuit from spending unnecessary time sensing the cells of a memory block. A read-friendly data block (RFDB) is composed of only “11” and “00” bit pairs; thus, once the most significant bit of a cell has been sensed, the read operation can be terminated early, reducing MLC read time and power consumption. Moreover, we increase the number of RFDBs using two simple techniques to better exploit the benefits of Express Read. Results obtained from full-system simulation show nearly 6% performance improvement and 21% energy gain, on average, over the baseline system.
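The RFDB property can be illustrated with a minimal software sketch. It is not the authors' circuit or implementation: it assumes 2-bit cells that store adjacent bit pairs of each byte, with the higher-order bit of a pair as the cell's most significant bit, and the function names `is_read_friendly`, `cell_msbs`, and `rebuild_from_msbs` are hypothetical.

```python
def is_read_friendly(block: bytes) -> bool:
    """Return True if every 2-bit pair in the block is 00 or 11.

    Assumption (not from the article): each byte maps to four cells,
    with bit pairs taken from the high-order end of the byte.
    """
    for byte in block:
        msbs = byte & 0b10101010  # upper bit of each pair
        lsbs = byte & 0b01010101  # lower bit of each pair
        if (msbs >> 1) != lsbs:   # some pair is 01 or 10
            return False
    return True


def cell_msbs(block: bytes) -> list[int]:
    """Extract only the most significant bit of each 2-bit cell."""
    return [(byte >> shift) & 1 for byte in block for shift in (7, 5, 3, 1)]


def rebuild_from_msbs(msbs: list[int]) -> bytes:
    """Rebuild a read-friendly block from the cell MSBs alone.

    Valid only for blocks where is_read_friendly() holds, because each
    cell's lower bit is then equal to its upper bit. Expects a multiple
    of four MSBs (four cells per byte).
    """
    out = bytearray()
    for i in range(0, len(msbs), 4):
        byte = 0
        for bit in msbs[i:i + 4]:
            byte = (byte << 2) | (0b11 if bit else 0b00)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    block = bytes([0x00, 0xFF, 0xC3, 0x3C])
    if is_read_friendly(block):
        # One sensed bit per cell is enough to recover the data.
        recovered = rebuild_from_msbs(cell_msbs(block))
        print(recovered == block)
```

For a block containing any “01” or “10” pair, `is_read_friendly` returns `False`, and in this sketch the block would need a full read of both bits per cell.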
Index Terms
- Express Read in MLC Phase Change Memories