Elsevier

Journal of Systems Architecture

Volume 97, August 2019, Pages 349-372

A survey of spintronic architectures for processing-in-memory and neural networks

https://doi.org/10.1016/j.sysarc.2018.11.005

Abstract

The rising overheads of data movement and the limitations of general-purpose processing architectures have led to a surge of interest in the “processing-in-memory” (PIM) approach and “neural network” (NN) architectures. Spintronic memories facilitate efficient implementation of the PIM approach and of NN accelerators, and offer several advantages over conventional memories. In this paper, we present a survey of spintronic architectures for PIM and NNs. We organize the works based on their main attributes to underscore their similarities and differences. This paper will be useful for researchers in the areas of artificial intelligence, hardware architecture, chip design and memory systems.

Introduction

As conventional von-Neumann style processors become increasingly constrained by data-movement overheads [1], use of the processing-in-memory (PIM) approach has become not merely attractive but imperative. Further, as machine learning algorithms are applied to cognitive tasks of ever-increasing complexity, their memory and computation demands are escalating rapidly. Since traditional processors are unable to meet these requirements, the design of domain-specific accelerators has become essential. These factors and trends call for research into novel memory technologies, architectures and design approaches.

Spintronic memories allow computations such as arithmetic and logic operations to be performed inside memory. They also allow efficient modeling of neurons and synapses, which makes them useful for accelerating neural networks [2]. These properties, along with the near-zero standby power and high density of spintronic memories, make them promising candidates for architecting future memory systems and even computing systems.

Use of spintronic memories, however, also presents key challenges. Compared to SRAM and DRAM, spintronic memories have higher latency and write energy. Also, most existing proposals implement simple neuron models, such as a neuron that produces a binary output based on the sign of its input. However, NN architectures aimed at solving complex cognitive tasks require more realistic neuron models [2]. Further, since some spin neuron-synapse units cannot be connected through spin signaling [3], they need to be connected using CMOS (complementary metal-oxide-semiconductor) based charge signaling. Evidently, the design of spintronic accelerators for PIM and NN is challenging and yet rewarding. Several circuit-, microarchitecture- and system-level techniques have recently been proposed towards this end.
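To make the contrast between neuron models concrete, the following is a minimal functional sketch in Python of a sign-based binary neuron versus a neuron with a soft-limiting transfer function. The function names and the choice of tanh as the soft-limiting function are illustrative assumptions, not the model of any one surveyed design:

```python
import math

def binary_neuron(inputs, weights):
    # Hard-thresholding neuron: the output depends only on the sign of
    # the weighted sum, as in many early spin-neuron proposals.
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s >= 0 else 0

def soft_limiting_neuron(inputs, weights):
    # A more realistic transfer function (tanh chosen here for
    # illustration), of the kind complex cognitive tasks require.
    s = sum(x * w for x, w in zip(inputs, weights))
    return math.tanh(s)
```

The binary neuron discards all magnitude information in the weighted sum, which is why architectures targeting complex tasks need the graded response of the second model.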

Contributions: In this paper, we present a survey of spintronic accelerators for PIM and NN. Fig. 1 summarizes the contents of this paper. Section 2 provides a background on key concepts and classifies the research works on key parameters. Sections 3 and 4 present techniques for designing logic and arithmetic units, respectively. Section 5 discusses spintronic accelerators for a range of application domains. In these sections, we focus on qualitative insights rather than quantitative results.

Finally, Section 6 concludes this paper with a discussion of future challenges. This paper will be useful for researchers interested in the confluence of machine learning, hardware architecture and memory architectures. Table 1 shows the acronyms used in this paper. Input and output carry are shown as Ci and Co, respectively.


Background and motivation

We now discuss relevant concepts and refer the reader to prior works for a background on NVMs [4], [5], [6], [7].

Spintronic logic units

In this section, we discuss spintronic PIM architectures for bitwise operations (Section 3.1), programmable switch and logic element (Section 3.2), MUX and encoder (Section 3.3) and random number generators (Section 3.4). Table 3 classifies the PIM architectures for performing logic operations based on their design features. It classifies the works as all-spin or spintronic logic. It then shows the bit-cell designs used by different works.

Table 3 also shows the DWM device designs used in PIM architectures.
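Bitwise PIM designs typically activate two word-lines at once and compare the combined bit-line current against sense references, so that AND and OR emerge in a single array access. The sketch below is a purely functional Python model of that idea; the thresholds are illustrative, not device-accurate, and the function name is our own:

```python
def pim_bitwise(row_a, row_b):
    # Functional model of single-cycle bitwise PIM: reading two rows
    # together, the combined current per bit-line is sensed against a
    # high reference (AND) and a low reference (OR).
    and_out, or_out = [], []
    for a, b in zip(row_a, row_b):
        current = a + b            # stands in for the combined read current
        and_out.append(1 if current >= 2 else 0)   # both cells conduct
        or_out.append(1 if current >= 1 else 0)    # at least one conducts
    return and_out, or_out
```

The key point the model captures is that neither operand leaves the array: the logic function is realized by the choice of sensing reference, not by a separate logic stage.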

Spintronic arithmetic units

In this section, we discuss various arithmetic units such as (precise) adder (Section 4.1), approximate adder (Section 4.2), multiplier (Section 4.3), majority gate-based designs (Section 4.4) and LUT designs (Section 4.5). Table 6 classifies these works on several important parameters. We now review these works.
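Since the 3-input majority gate is the native logic primitive of many spintronic devices, majority-gate-based adders recur throughout these works. The following Python sketch shows one well-known majority decomposition of a full adder; it is an illustrative construction, not the circuit of any single surveyed paper:

```python
def maj(a, b, c):
    # 3-input majority gate: output is 1 iff at least two inputs are 1.
    return 1 if a + b + c >= 2 else 0

def full_adder(a, b, ci):
    # Carry-out is directly a majority of the three inputs; the sum bit
    # can be built from two further majority gates plus inversions.
    co = maj(a, b, ci)
    s = maj(1 - co, ci, maj(a, b, 1 - ci))
    return s, co
```

Expressing the adder entirely in majority gates and inverters is what lets these designs map onto MTJ, all-spin-logic and skyrmion devices without a separate Boolean logic layer.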

Spintronic accelerators for various applications

In this section, we review spintronic architectures in terms of their application domains, such as neuromorphic computing (Section 5.1), image processing (Section 5.2), data encryption (Section 5.3) and associative computing (Section 5.4).
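Of these domains, associative computing is perhaps the easiest to state functionally: every stored row is compared against a search key in parallel inside the array, and only match indices are returned. The Python sketch below models that behavior in software; in hardware designs such as AC-DIMM the comparison happens on the bit-lines, and the function shown is our own illustrative stand-in:

```python
def associative_search(stored_rows, key):
    # Functional model of a content-addressable (associative) search:
    # conceptually, all rows are compared against the key in parallel
    # and the indices of exact matches are reported.
    return [i for i, row in enumerate(stored_rows) if row == key]
```

The data-movement saving is the point: only the (usually short) list of matching indices crosses the memory interface, not the rows themselves.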

Conclusion and future outlook

Memory latency and bandwidth constraints have now become the key bottleneck in scaling the performance of modern processors. Although traditional techniques such as prefetching [81] and data compression [82] can partially mitigate these overheads, approaches that provide much higher efficiency are required for architecting next-generation processors. In this paper, we presented a survey of spintronic architectures for enabling “processing-in-memory” and designing accelerators for “neural networks”.

Acknowledgment

Support for this work was provided by Science and Engineering Research Board, award number ECR/2017/000622.

Sumanth Umesh is presently pursuing B.Tech. degree in the Department of Electrical Engineering at IIT Jodhpur, India. His research interests include Spintronic Memories and Hardware Architecture for Machine Learning.

References (83)

  • S. Peng et al.

    Magnetic tunnel junctions for spintronics: principles and applications

    Wiley Encycl. Electr. Electron. Eng.

    (2014)
  • M. Wang et al.

    Current-induced magnetization switching in atom-thick tungsten engineered perpendicular magnetic tunnel junctions with large tunnel magnetoresistance

    Nat. Commun.

    (2018)
  • I. Ahmed et al.

A comparative study between spin-transfer-torque and spin-Hall-effect switching mechanisms in pMTJ using SPICE

    IEEE J. Explor. Solid-State Comput. Devices Circ.

    (2017)
  • W. Kang et al.

    Readability challenges in deeply scaled STT-MRAM

    Non-Volatile Memory Technology Symposium (NVMTS), 2014 14th Annual

    (2014)
  • S. Mittal

    A survey of soft-error mitigation techniques for non-volatile memories

    Computers

    (2017)
  • S. Mittal et al.

    Addressing read-disturbance issue in STT-RAM by data compression and selective duplication

    IEEE Comput. Archit. Lett.

    (2017)
  • W. Kang et al.

    Advanced low power spintronic memories beyond STT-MRAM

Proceedings of the Great Lakes Symposium on VLSI 2017

    (2017)
  • J.G. Alzate et al.

    Voltage-induced switching of nanoscale magnetic tunnel junctions

    Electron Devices Meeting (IEDM), 2012 IEEE International

    (2012)
  • A. Roohi et al.

    A tunable majority gate-based full adder using current-induced domain wall nanomagnets

    IEEE Trans. Magn.

    (2016)
  • K. Huang et al.

    A low power and high sensing margin non-volatile full adder using racetrack memory

    IEEE Trans. Circ. Syst. I

    (2015)
  • W. Kang et al.

    Skyrmion-electronics: an overview and outlook.

    Proc. IEEE

    (2016)
  • X. Zhang et al.

    Magnetic skyrmion logic gates: conversion, duplication and merging of skyrmions

    Sci. Rep.

    (2015)
  • Q. An et al.

    Full-adder circuit design based on all-spin logic device

    Nanoscale Architectures (NANOARCH), 2015 IEEE/ACM International Symposium on

    (2015)
  • H. Mahmoudi et al.

    High performance MRAM-based stateful logic

    Ultimate Integration on Silicon (ULIS), 2014 15th International Conference on

    (2014)
  • Q. Guo et al.

    AC-DIMM: associative computing with STT-MRAM

    ACM SIGARCH Comput. Arch. News

    (2013)
  • Z. He et al.

    Exploring STT-MRAM based in-memory computing paradigm with application of image edge extraction

    Computer Design (ICCD), 2017 IEEE International Conference on

    (2017)
  • W. Kang et al.

In-memory processing paradigm for bitwise logic operations in STT-MRAM

    IEEE Trans. Magn.

    (2017)
  • S. Jain et al.

    Computing-in-memory with spintronics

    DATE

    (2018)
  • H. Mahmoudi et al.

    MRAM-based logic array for large-scale non-volatile logic-in-memory applications

    Nanoscale Architectures (NANOARCH), 2013 IEEE/ACM International Symposium on

    (2013)
  • S. Matsunaga et al.

    MTJ-based nonvolatile logic-in-memory circuit, future prospects and issues

    Proceedings of the Conference on Design, Automation and Test in Europe

    (2009)
  • F. Parveen et al.

    HielM: highly flexible in-memory computing using STT MRAM

    Design Automation Conference (ASP-DAC), 2018 23rd Asia and South Pacific

    (2018)
  • P. Butzen et al.

    Reliable majority voter based on spin transfer torque magnetic tunnel junction device

    Electron. Lett.

    (2015)
  • A.F. Vincent et al.

    Spin-transfer torque magnetic memory as a stochastic memristive synapse for neuromorphic systems

    IEEE Trans. Biomed. Circuits Syst.

    (2015)
  • D. Fan et al.

    STT-SNN: a spin-transfer-torque based soft-limiting non-linear neuron for low-power artificial neural networks

    IEEE Trans. Nanotechnol.

    (2015)
  • H. Cai et al.

    Approximate computing in MOS/spintronic non-volatile full-adder

    Nanoscale Architectures (NANOARCH), 2016 IEEE/ACM International Symposium on

    (2016)
  • L.A. de Barros Naviner et al.

    Stochastic computation with spin torque transfer magnetic tunnel junction

    New Circuits and Systems Conference (NEWCAS), 2015 IEEE 13th International

    (2015)
  • Y. Wang et al.

    A novel circuit design of true random number generator using magnetic tunnel junction

    Nanoscale Architectures (NANOARCH), 2016 IEEE/ACM International Symposium on

    (2016)
  • T. Hanyu et al.

    Spintronics-based nonvolatile logic-in-memory architecture towards an ultra-low-power and highly reliable VLSI computing paradigm

    Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition

    (2015)
  • T. Hanyu et al.

    Challenge of MOS/MTJ-hybrid nonvolatile logic-in-memory architecture in dark-silicon era

    Electron Devices Meeting (IEDM), 2014 IEEE International

    (2014)
  • B. Lokesh et al.

    Full adder based reconfigurable spintronic ALU using STT-MTJ

    India Conference (INDICON), 2013 Annual IEEE

    (2013)
  • D. Kumar et al.

Design of 2:1 multiplexer and 1:2 demultiplexer using magnetic tunnel junction elements

    Emerging Trends in VLSI, Embedded System, Nano Electronics and Telecommunication System (ICEVENT), 2013 International Conference on

    (2013)

    Sparsh Mittal received the B.Tech. degree in electronics and communications engineering from IIT, Roorkee, India and the Ph.D. degree in computer engineering from Iowa State University (ISU), USA. He worked as a post-doctoral research associate at Oak Ridge National Lab (ORNL), USA for 3 years. He is currently working as an assistant professor at IIT Hyderabad, India. He was the graduating topper of his batch in B.Tech. and has received fellowship from ISU and performance award from ORNL. Sparsh has published more than 75 papers in top conferences and journals. His research has been covered by several technical news websites, e.g. Phys.org, InsideHPC, Primeur Magazine, StorageSearch, Data-Compression.info, TechEnablement, ScientificComputing, SemiEngineering, ReRAM forum and HPCWire. His research interests include accelerators for neural networks, architectures for machine learning, non-volatile memory, and GPU architectures. His webpage is http://www.iith.ac.in/~sparsh/.

1. Sumanth worked on this paper while working as an intern at IIT Hyderabad.
