ABSTRACT
Processing-in-Memory (PIM) architectures have gained popularity due to their ability to alleviate the memory wall by performing large numbers of operations within the memory itself. On top of this, nonvolatile memory (NVM) technologies offer highly energy-efficient operations, rendering processing in NVM especially promising. Unfortunately, a major drawback is that NVM has limited endurance. Even when used for standard memory, nonvolatile technologies face limited lifetimes, which is exacerbated by imbalanced usage of memory cells. PIM significantly increases the number of operations the memory is required to perform, making the problem much worse. In this work, we quantitatively analyze the impact of PIM applications on endurance considering representative memory technologies. Our findings indicate that limited endurance can easily block the performance and energy efficiency potential of PIM architectures. Even the best known technologies of today can fall short of meeting practical lifetime expectations. This highlights the importance of research efforts to improve endurance especially at the device technology level. Our study represents the first step in characterizing the very demanding endurance needs of PIM applications to derive a detailed technology level design specification.
- 2023. PIM Endurance Simulator. https://github.com/SalonikResch/PIMenduranceSimulator.git.Google Scholar
- Shahanur Alam, Chris Yakopcic, and Tarek M Taha. 2022. Memristor Based Federated Learning for Network Security on the Edge using Processing in Memory (PIM) Computing. In 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 1--8.Google ScholarCross Ref
- Elia Ambrosi, Alessandro Bricalli, Mario Laudato, and Daniele Ielmini. 2019. Impact of oxide and electrode materials on the switching characteristics of oxide ReRAM devices. Faraday discussions 213 (2019), 87--98.Google Scholar
- Zhenhua Cai, Jiayun Lin, Fang Liu, Zhiguang Chen, and Hongtao Li. 2020. NVM-Cache: Wear-Aware Load Balancing NVM-based Caching for Large-Scale Storage Systems. In 2020 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom). IEEE, 657--665.Google Scholar
- Peter R Cappello and Kenneth Steiglitz. 1983. A VLSI layout for a pipelined Dadda multiplier. ACM Transactions on Computer Systems (TOCS) 1, 2 (1983), 157--174.Google ScholarDigital Library
- Zamshed Chowdhury, Jonathan D Harms, S Karen Khatamifard, Masoud Zabihi, Yang Lv, Andrew P Lyle, Sachin S Sapatnekar, Ulya R Karpuzcu, and Jian-Ping Wang. 2017. Efficient in-memory processing using spintronics. IEEE Computer Architecture Letters 17, 1 (2017), 42--46.Google ScholarDigital Library
- Zamshed Chowdhury, S Karen Khatamifard, Salonik Resch, Husrev Cilasun, Zhengyang Zhao, Masoud Zabihi, Meisam Razaviyayn, Jian-Ping Wang, Sachin Sapatnekar, and Ulya R Karpuzcu. 2022. CRAM-Seq: Accelerating RNA-Seq Abundance Quantification using Computational RAM. IEEE Transactions on Emerging Topics in Computing (2022).Google Scholar
- Hüsrev Cılasun, Salonik Resch, Zamshed Iqbal Chowdhury, Erin Olson, Masoud Zabihi, Zhengyang Zhao, Thomas Peterson, Jian-Ping Wang, Sachin S Sapatnekar, and Ulya Karpuzcu. 2020. Crafft: High resolution fft accelerator in spintronic computational ram. In 2020 57th ACM/IEEE Design Automation Conference (DAC). IEEE, 1--6.Google ScholarCross Ref
- Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2016. Binarized neural networks: Training deep neural networks with weights and activations constrained to+ 1 or-1. arXiv preprint arXiv:1602.02830 (2016).Google Scholar
- Xiangyu Dong, Cong Xu, Yuan Xie, and Norman P Jouppi. 2012. Nvsim: A circuit-level performance, energy, and area model for emerging nonvolatile memory. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 31, 7 (2012), 994--1007.Google ScholarDigital Library
- Alessandro Grossi, Elisa Vianello, Mohamed M Sabry, Marios Barlas, Laurent Grenouillet, Jean Coignus, Edith Beigne, Tony Wu, Binh Q Le, Mary K Wootters, et al. 2019. Resistive RAM endurance: Array-level characterization and correction techniques targeting deep learning applications. IEEE Transactions on Electron Devices 66, 3 (2019), 1281--1288.Google ScholarCross Ref
- Saransh Gupta, Mohsen Imani, and Tajana Rosing. 2018. Felix: Fast and energy-efficient logic in memory. In 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 1--7.Google ScholarDigital Library
- Christian Hakert, Kuan-Hsun Chen, Paul R Genssler, Georg von der Brüggen, Lars Bauer, Hussam Amrouch, Jian-Jia Chen, and Jörg Henkel. 2020. Softwear: Software-only in-memory wear-leveling for non-volatile main memory. arXiv preprint arXiv:2004.03244 (2020).Google Scholar
- Zhezhi He, Yang Zhang, Shaahin Angizi, Boqing Gong, and Deliang Fan. 2018. Exploring a SOT-MRAM based in-memory computing for data processing. IEEE Transactions on Multi-Scale Computing Systems 4, 4 (2018), 676--685.Google ScholarCross Ref
- John L Hennessy and David A Patterson. 2011. Computer architecture: a quantitative approach. Elsevier.Google ScholarDigital Library
- Jian Huang, Anirudh Badam, Laura Caulfield, Suman Nath, Sudipta Sengupta, Bikash Sharma, and Moinuddin K Qureshi. 2017. FlashBlox: Achieving Both Performance Isolation and Uniform Lifetime for Virtualized SSDs.. In FAST, Vol. 17. 375--390.Google Scholar
- Mohsen Imani, Saransh Gupta, Yeseong Kim, and Tajana Rosing. 2019. Floatpim: In-memory acceleration of deep neural network training with high precision. In Proceedings of the 46th International Symposium on Computer Architecture. 802--815.Google ScholarDigital Library
- Andrew D Kent and Daniel C Worledge. 2015. A new spin on magnetic memories. Nature nanotechnology 10, 3 (2015), 187--191.Google Scholar
- SangBum Kim, Geoffrey W Burr, Wanki Kim, and Sung-Wook Nam. 2019. Phase-change memory cycling endurance. MRS Bulletin 44, 9 (2019), 710--714.Google ScholarCross Ref
- Shahar Kvatinsky, Dmitry Belousov, Slavik Liman, Guy Satat, Nimrod Wald, Eby G Friedman, Avinoam Kolodny, and Uri C Weiser. 2014. MAGIC---Memristoraided logic. IEEE Transactions on Circuits and Systems II: Express Briefs 61, 11 (2014), 895--899.Google ScholarCross Ref
- Shuangchen Li, Cong Xu, Qiaosha Zou, Jishen Zhao, Yu Lu, and Yuan Xie. 2016. Pinatubo: A processing-in-memory architecture for bulk bitwise operations in emerging non-volatile memories. In Proceedings of the 53rd Annual Design Automation Conference. 1--6.Google ScholarDigital Library
- Jeffry Louis, Barak Hoffer, and Shahar Kvatinsky. 2019. Performing memristor-aided logic (MAGIC) using STT-MRAM. In 2019 26th IEEE International Conference on Electronics, Circuits and Systems (ICECS). IEEE, 787--790.Google ScholarCross Ref
- Sadahiko Miura, Koichi Nishioka, Hiroshi Naganuma, TV Anh Nguyen, Hiroaki Honjo, Shoji Ikeda, Toshinari Watanabe, Hirofumi Inoue, Masaaki Niwa, Takaho Tanigawa, et al. 2020. Scalability of Quad Interface p-MTJ for 1X nm STT-MRAM With 10-ns Low Power Write Operation, 10 Years Retention and Endurance> 1011. IEEE Transactions on Electron Devices 67, 12 (2020), 5368--5373.Google ScholarCross Ref
- Arijit Nath and Hemangee K Kapoor. 2020. WELCOMF: wear leveling assisted compression using frequent words in non-volatile main memories. In Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design. 157--162.Google ScholarDigital Library
- Keni Qiu, Nicholas Jao, Mengying Zhao, Cyan Subhra Mishra, Gulsum Gudukbay, Sethu Jose, Jack Sampson, Mahmut Taylan Kandemir, and Vijaykrishnan Narayanan. 2020. ResiRCA: A resilient energy harvesting ReRAM crossbar-based accelerator for intelligent embedded processors. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 315--327.Google ScholarCross Ref
- Moinuddin K Qureshi, Michele M Franceschini, and Luis A Lastras-Montano. 2010. Improving read performance of phase change memories via write cancellation and write pausing. In HPCA-16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture. IEEE, 1--11.Google ScholarCross Ref
- Moinuddin K Qureshi, John Karidis, Michele Franceschini, Vijayalakshmi Srinivasan, Luis Lastras, and Bulent Abali. 2009. Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling. In Proceedings of the 42nd annual IEEE/ACM international symposium on microarchitecture. 14--23.Google ScholarDigital Library
- S Ravi, Govind Shaji Nair, Rajeev Narayan, and Harish M Kittur. 2015. Low power and efficient dadda multiplier. Research Journal of Applied Sciences, Engineering and Technology 9, 1 (2015), 53--57.Google ScholarCross Ref
- Salonik Resch, S Karen Khatamifard, Zamshed I Chowdhury, Masoud Zabihi, Zhengyang Zhao, Husrev Cilasun, Jian-Ping Wang, Sachin S Sapatnekar, and Ulya R Karpuzcu. 2020. MOUSE: Inference in non-volatile memory for energy harvesting applications. In 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 400--414.Google ScholarCross Ref
- Salonik Resch, S Karen Khatamifard, Zamshed I Chowdhury, Masoud Zabihi, Zhengyang Zhao, Husrev Cilasun, Jian-Ping Wang, Sachin S Sapatnekar, and Ulya R Karpuzcu. 2021. Energy Efficient and Reliable Inference in Nonvolatile Memory under Extreme Operating Conditions. ACM Transactions on Embedded Computing Systems (TECS) (2021).Google Scholar
- Salonik Resch, S Karen Khatamifard, Zamshed Iqbal Chowdhury, Masoud Zabihi, Zhengyang Zhao, Jian-Ping Wang, Sachin S Sapatnekar, and Ulya R Karpuzcu. 2019. Pimball: Binary neural networks in spintronic memory. ACM Transactions on Architecture and Code Optimization (TACO) 16, 4 (2019), 1--26.Google ScholarDigital Library
- Daisuke Saida, Saori Kashiwada, Megumi Yakabe, Tadaomi Daibou, Naoki Hase, Miyoshi Fukumoto, Shinji Miwa, Yoshishige Suzuki, Hiroki Noguchi, Shinobu Fujita, et al. 2016. Sub-3 ns pulse with sub-100 μA switching of 1x--2x nm perpendicular MTJ for high-performance embedded STT-MRAM towards sub-20 nm CMOS. In 2016 IEEE Symposium on VLSI Technology. IEEE, 1--2.Google ScholarCross Ref
- Vivek Seshadri, Donghyuk Lee, Thomas Mullins, Hasan Hassan, Amirali Boroumand, Jeremie Kim, Michael A Kozuch, Onur Mutlu, Phillip B Gibbons, and Todd C Mowry. 2017. Ambit: In-memory accelerator for bulk bitwise operations using commodity DRAM technology. In 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 273--287.Google ScholarDigital Library
- Yohei Shiokawa, Eiji Komura, Yugo Ishitani, Atsushi Tsumita, Keita Suda, Yuji Kakinuma, and Tomoyuki Sasaki. 2019. High write endurance up to 1012 cycles in a spin current-type magnetic memory array. AIP Advances 9, 3 (2019), 035236.Google ScholarCross Ref
- Zainab Swaidan, Rouwaida Kanj, Johnny El Hajj, Edward Saad, and Fadi Kurdahi. 2019. RRAM Endurance and Retention: Challenges, Opportunities and Implications on Reliable Design. In 2019 26th IEEE International Conference on Electronics, Circuits and Systems (ICECS). IEEE, 402--405.Google Scholar
- Whitney J Townsend, Earl E Swartzlander Jr, and Jacob A Abraham. 2003. A comparison of Dadda and Wallace multiplier delays. In Advanced signal processing algorithms, architectures, and implementations XIII, Vol. 5205. International Society for Optics and Photonics, 552--560.Google Scholar
- Jian-Ping Wang, Sachin S Sapatnekar, Chris H Kim, Paul Crowell, Steve Koester, Supriyo Datta, Kaushik Roy, Anand Raghunathan, X Sharon Hu, Michael Niemier, et al. 2017. A pathway to enable exponential scaling for the beyond-CMOS era. In Proceedings of the 54th Annual Design Automation Conference 2017. 1--6.Google ScholarDigital Library
- Wen Wen, Youtao Zhang, and Jun Yang. 2018. Wear leveling for crossbar resistive memory. In 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC). IEEE, 1--6.Google ScholarDigital Library
- Wen Wen, Youtao Zhang, and Jun Yang. 2019. ReNEW: Enhancing lifetime for ReRAM crossbar based neural network accelerators. In 2019 IEEE 37th International Conference on Computer Design (ICCD). IEEE, 487--496.Google ScholarCross Ref
- H-S Philip Wong, Simone Raoux, SangBum Kim, Jiale Liang, John P Reifenberg, Bipin Rajendran, Mehdi Asheghi, and Kenneth E Goodson. 2010. Phase change memory. Proc. IEEE 98, 12 (2010), 2201--2227.Google ScholarCross Ref
- T Patrick Xiao, Christopher H Bennett, Ben Feinberg, Sapan Agarwal, and Matthew J Marinella. 2020. Analog architectures for neural network acceleration based on non-volatile memory. Applied Physics Reviews 7, 3 (2020), 031301.Google ScholarCross Ref
- Leonid Yavits, Lois Orosa, Suyash Mahar, João Dinis Ferreira, Mattan Erez, Ran Ginosar, and Onur Mutlu. 2020. WoLFRaM: Enhancing wear-leveling and fault tolerance in resistive memories using programmable address decoders. In 2020 IEEE 38th International Conference on Computer Design (ICCD). IEEE, 187--196.Google ScholarCross Ref
- Masoud Zabihi, Zamshed Iqbal Chowdhury, Zhengyang Zhao, Ulya R Karpuzcu, Jian-Ping Wang, and Sachin S Sapatnekar. 2018. In-memory processing on the spintronic CRAM: From hardware design to application mapping. IEEE Trans. Comput. 68, 8 (2018), 1159--1173.Google ScholarDigital Library
- Masoud Zabihi, Arvind K Sharma, Meghna G Mankalale, Zamshed Iqbal Chowdhury, Zhengyang Zhao, Salonik Resch, Ulya R Karpuzcu, Jian-Ping Wang, and Sachin S Sapatnekar. 2020. Analyzing the effects of interconnect parasitics in the stt cram in-memory computational platform. IEEE Journal on Exploratory Solid-State Computational Devices and Circuits 6, 1 (2020), 71--79.Google ScholarCross Ref
- Masoud Zabihi, Zhengyang Zhao, DC Mahendra, Zamshed I Chowdhury, Salonik Resch, Thomas Peterson, Ulya R Karpuzcu, Jian-Ping Wang, and Sachin S Sapatnekar. 2019. Using spin-Hall MTJs to build an energy-efficient in-memory computation platform. In 20th International Symposium on Quality Electronic Design (ISQED). IEEE, 52--57.Google ScholarCross Ref
- Meiran Zhao, Huaqiang Wu, Bin Gao, Xiaoyu Sun, Yuyi Liu, Peng Yao, Yue Xi, Xinyi Li, Qingtian Zhang, Kanwen Wang, et al. 2018. Characterizing endurance degradation of incremental switching in analog RRAM for neuromorphic systems. In 2018 IEEE International Electron Devices Meeting (IEDM). IEEE, 20--2.Google ScholarCross Ref
- Ping Zhou, Bo Zhao, Jun Yang, and Youtao Zhang. 2009. A durable and energy efficient main memory using phase change memory technology. ACM SIGARCH computer architecture news 37, 3 (2009), 14--23.Google Scholar
Index Terms
- On Endurance of Processing in (Nonvolatile) Memory
Recommendations
Write Activity Minimization for Nonvolatile Main Memory Via Scheduling and Recomputation
Nonvolatile memories such as Flash memory, phase change memory (PCM), and magnetic random access memory (MRAM) have many desirable characteristics for embedded systems to employ them as main memory. However, there are two common challenges we need to ...
Exploring the use of emerging nonvolatile memory technologies in future FPGAs
As new nonvolatile memory technologies become increasingly mature, there has been a growing interest on investigating their use in future field-programmable gate arrays (FPGAs). Similar to existing FPGAs with embedded Flash memory, future FPGAs can ...
Enabling Hybrid PCM Memory System with Inherent Memory Management
RACS '16: Proceedings of the International Conference on Research in Adaptive and Convergent SystemsReplacing the traditional volatile main memory, e.g., DRAM, with a non-volatile phase change memory (PCM) has become a possible solution to reduce the energy consumption of computing systems. To further reduce the bit cost of PCM, the development trend ...
Comments