## PAPER Special Section on Multiple-Valued Logic and VLSI Computing

# **Energy-Efficient and Highly-Reliable Nonvolatile FPGA Using Self-Terminated Power-Gating Scheme**

Daisuke SUZUKI<sup>†a)</sup> and Takahiro HANYU<sup>††</sup>, Members

**SUMMARY** An energy-efficient nonvolatile FPGA that assures highly reliable backup operation by using a self-terminated power-gating scheme is proposed. Since the write current is automatically cut off just after the temporal data in the flip-flop is successfully backed up in the nonvolatile device, the write energy can be minimized with no write failure. Moreover, when the backup operation in a particular cluster is completed, the power supply of the cluster is immediately turned off, which minimizes the standby energy due to leakage current. As a result, the total energy consumption during the backup operation is reduced by 66% in comparison with that of a conventional worst-case-based approach, where a long write current pulse is used to ensure a reliable write.

key words: field-programmable gate array (FPGA), power gating, magnetic tunnel junction (MTJ) device, and low power digital

## 1. Introduction

While SRAM-based field-programmable gate arrays (FPGAs) are utilized in a variety of applications [1], standby-power dissipation due to leakage current is a critical issue in battery-powered or energy-harvesting applications. A nonvolatile FPGA, wherein all the configuration data as well as temporal data are stored in nonvolatile devices, is one promising solution to the standby-power problem owing to its zero-standby-power capability [2]–[8]. There are several key requirements for the nonvolatile device: (1) virtually unlimited endurance, because temporal data in flip-flops (FFs) must be backed up before power-off; (2) three-dimensional (3D) stacking capability; and (3) CMOS compatibility, for area efficiency of the configuration cell. While all the nonvolatile devices described above satisfy (2) and (3), ReRAM devices, atom switches, and PCRAM devices do not satisfy (1). In contrast, a magnetic tunnel junction (MTJ) device has virtually unlimited endurance of over 10<sup>16</sup> [12]. However, the large energy consumption of the backup operation is a major challenge for the MTJ device. Since MTJ switching is stochastic and the actual time to complete the write operation varies dramatically, a reliable backup operation requires a write current pulse that is longer than the average switching time. The long backup operation also wastes leakage energy, because the power supply of a particular logic cluster must be kept on until the backup operation is completed.

Manuscript received October 21, 2016.

In this paper, an energy-efficient nonvolatile FPGA that assures highly reliable backup operation by using a self-terminated power-gating scheme is proposed. The self-terminated write driver automatically cuts off the write current by monitoring the change in current or voltage level when the desired data is written in the nonvolatile device [9], [10]. Just after the data is written, a write-completion signal is generated and the write operation is automatically terminated. Therefore, self-terminated power gating can be realized by detecting all the write-completion signals in a particular cluster. To detect the write-completion signals, a dynamic AND gate is utilized, which makes it possible to implement the power-gating controller with compact circuitry. As a result, the total energy consumption during the backup operation is reduced by 66% in comparison with that of a conventional worst-case-based approach, where a long write current pulse is used to ensure a reliable write.

## 2. Architecture of Nonvolatile FPGA

In this paper, an MTJ device [11]–[13] is used; note that the proposed power-gating scheme is also applicable to PCRAM- or ReRAM-based FPGAs. Figure 1 (a) shows the structure of an MTJ device, which consists of two ferromagnetic layers separated by a tunnel barrier. The magnetization of the pinned layer (PL) is fixed, while that of the free layer (FL) is free to rotate. By controlling the direction of the magnetization of the FL with respect to the PL, the MTJ device can be configured to a low-resistance state ($R_0$, the corresponding state is Y=0) or a high-resistance state ($R_1$, the corresponding state is Y=1). Thus, the MTJ device can be regarded as a variable resistor whose resistance value is programmed by write currents $I_{W0}$ ($R_1$ to $R_0$) or $I_{W1}$ ($R_0$ to $R_1$), as shown in Fig. 1 (b). The relative difference between $R_0$ and $R_1$ is defined as $\Delta_R = (R_1 - R_0)/R_0$.

**Fig. 1** MTJ device: (a) structure, (b) R-I curve. The MTJ device can be regarded as a variable resistor whose resistance is programmed by write currents $I_{W0}$ or $I_{W1}$.

Manuscript revised March 1, 2017.

Manuscript publicized May 19, 2017.

<sup>†</sup>The author is with the Frontier Research Institute for Interdisciplinary Sciences, Tohoku University, Sendai-shi, 980–8578 Japan.

<sup>††</sup>The author is with the Research Institute of Electrical Communication, Tohoku University, Sendai-shi, 980–8578 Japan.

a) E-mail: daisuke.suzuki.e6@tohoku.ac.jp

DOI: 10.1587/transinf.2016LOP0015

**Fig. 3** Overall architecture of the proposed nonvolatile FPGA. When the self-termination signal becomes high, the power switch of the tile is turned off.

Due to the stochastic nature of the switching mechanism, the switching time of the MTJ device varies. The switching probability of the MTJ device ($P_{SW}$) is given as follows:

$$P_{SW} = 1 - \exp\left\{-\frac{t}{\tau_0} \exp\left[-\Delta\left(1 - \frac{I}{I_{C0}}\right)^2\right]\right\} \tag{1}$$

where *t* is the duration of the current pulse, *I* is the write current, $\tau_0$ is the attempt time (1 ns), $\Delta$ is the thermal stability factor, and $I_{C0}$ is the critical switching current. Equation (1) indicates that, for a fixed *I*, the switching probability approaches 1 only when *t* is long. As a result, a worst-case-oriented long current pulse is required for a reliable write, as shown in Fig. 2.
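To make the consequence of Eq. (1) concrete, the following minimal sketch evaluates the switching probability and inverts it to obtain the pulse width needed for a target write-error rate. The thermal stability factor and the current ratio $I/I_{C0}$ used here are hypothetical illustration values, not parameters taken from this paper, and the function names are ours.

```python
import math


def switching_rate_per_ns(i_ratio, delta, tau0_ns=1.0):
    """Effective switching rate inside Eq. (1); i_ratio is I / I_C0."""
    return math.exp(-delta * (1.0 - i_ratio) ** 2) / tau0_ns


def switching_probability(t_ns, i_ratio, delta, tau0_ns=1.0):
    """P_SW of Eq. (1) for a current pulse of duration t_ns."""
    return 1.0 - math.exp(-t_ns * switching_rate_per_ns(i_ratio, delta, tau0_ns))


def pulse_width_for_error(target_error, i_ratio, delta, tau0_ns=1.0):
    """Invert Eq. (1): pulse width (ns) at which 1 - P_SW equals target_error."""
    return -math.log(target_error) / switching_rate_per_ns(i_ratio, delta, tau0_ns)


# Hypothetical operating point (assumption, not from the paper).
DELTA, I_RATIO = 60.0, 0.9
t_median = pulse_width_for_error(0.5, I_RATIO, DELTA)     # half of the writes finished
t_reliable = pulse_width_for_error(1e-9, I_RATIO, DELTA)  # write-error rate of 1e-9
print(f"median: {t_median:.2f} ns, reliable: {t_reliable:.2f} ns")
```

Under this model the pulse width for a very low error rate is many times the median switching time, which is the gap that a self-terminated write avoids paying on every backup.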

#### 2.1 Proposed Nonvolatile FPGA Architecture

Figure 3 shows the overall architecture of the proposed nonvolatile FPGA, which is composed of a two-dimensional array of tiles [14]. Each tile is composed of a power switch, a configurable logic block (CLB), a switch block (SB), and two connection blocks (CBs). MTJ devices are stacked over the CMOS circuit plane of each block [6]. When the self-termination signal becomes high, the power switch of the tile is turned off. The CLB is surrounded by routing tracks on all four sides, and its logic input and output pins can be connected to the other CLBs via CBs and SBs.

**Fig. 4** Block diagram of CB and SB.

Figure 4 shows the routing architecture of the FPGA. Logic input and output signals of the CLB are interfaced by the CB, and the signals are propagated to adjacent tiles via the SB. The CB is composed of routing switches, each of which pairs an MTJ-based nonvolatile storage element (NVSE) with an NMOS pass gate. The effective area of the NVSE is kept small by sharing a large write-control transistor among several NVSEs and reducing the size of the write-control transistors in each NVSE [16]. The SB is composed of six routing switches and buffers. By configuring the latches, arbitrary signal routing can be performed. We define the number of wires in the routing track as $W$, the fraction of wires in each channel to which a CLB input pin can connect as $F_{cin}$, and that for an output pin as $F_{cout}$. For example, in Fig. 4, $W$ is 4, $F_{cin}$ is 0.5, and $F_{cout}$ is 0.5.

Figure 5 shows the block diagram of the CLB, which is composed of *N* CLB slices. Each CLB slice has *I* external logic inputs and *N* internal feedback inputs from the CLB slices. In this paper, we assume that the CLB layout is one-dimensional and that *N* is at most 10 [15]. The CLB slice consists of a *K*-input nonvolatile lookup table (LUT) circuit [8], a self-terminated FF [10], and a MUX to perform both combinational and sequential logic functions. The output of the FF (Q) is selected as the output of the LE (Z) when SEL=1, and the output of the LUT circuit (D) is selected when SEL=0. The LUT circuit performs any *K*-input logic function by using input vectors (**X**). The *K* *M*-to-1 multiplexers interface between the LUT circuit and the *M*-bit inputs (*I* external inputs and *N* feedback inputs). Each nonvolatile FF contains a self-terminated write driver, and when all the write-completion signals (F[0], ..., F[*N*-1]) become high, the power supply of the tile is cut off by the power switch controller.

**Fig. 5** Block diagram of the CLB. Each nonvolatile FF contains a self-terminated write driver. When all the write-completion signals (F[0], ..., F[N-1]) become high, the power supply of the tile is turned off by the power switch controller.
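The data path of one CLB slice described above can be sketched behaviorally as follows. This is only an illustrative model: the class and field names are hypothetical, the bit ordering of the input vector (X[0] as the least significant LUT address bit) is an assumption, and the nonvolatile storage of the configuration is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClbSlice:
    lut_bits: List[int]    # 2**K truth-table entries (LUT configuration)
    mux_select: List[int]  # K indices into the M-bit input bus, one per M-to-1 MUX
    sel: int               # 1: Z = Q (sequential), 0: Z = D (combinational)
    q: int = 0             # current FF state

    def lut_output(self, bus: List[int]) -> int:
        x = [bus[s] for s in self.mux_select]                 # input vector X
        index = sum(bit << pos for pos, bit in enumerate(x))  # X[0] is the LSB (assumption)
        return self.lut_bits[index]

    def clock(self, bus: List[int]) -> None:
        self.q = self.lut_output(bus)  # the FF captures D on a clock edge

    def output(self, bus: List[int]) -> int:
        return self.q if self.sel else self.lut_output(bus)


# K = 2 AND function fed from positions 0 and 3 of an M = 4 bit input bus.
and_slice = ClbSlice(lut_bits=[0, 0, 0, 1], mux_select=[0, 3], sel=0)
print(and_slice.output([1, 0, 0, 1]))
```

With `sel=1` the same slice returns the registered value, so the LUT result only appears at Z after `clock()` has been called.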

Figure 6 shows the schematic diagram of the self-terminated FF. It consists of a CMOS FF core, a self-terminated write driver, and a nonvolatile storage cell. During the normal mode, the FF operates as a standard CMOS FF, and the self-terminated driver and the nonvolatile storage cell are electrically separated from it. During the backup mode, CLK is set low and the driver is activated by the store enable signal (STR). Then, a bidirectional write current is applied to the nonvolatile storage cell, and the temporal data of the master latch (A) is stored in the MTJ device. Once the desired data is written in the MTJ device, the write enable signal (WR) becomes low and the bidirectional write current is cut off. The recall enable signal (RCL) is set low so that $V_R$ is electrically separated from BL0. During the recall mode, RCL becomes high and the stored data is recalled to the slave latch. Two inverters in the slave latch are used to amplify $V_R$ to a full-swing output voltage.

Figure 7 shows the block diagram of the proposed self-terminated write driver. The voltage signals of BL0 ($V_{BL0}$) and BL1 ($V_{BL1}$) are selectively used for monitoring the state transition of the MTJ device from $R_1$ to $R_0$ and from $R_0$ to $R_1$, respectively. Because the change in $V_{BL0}$ is significant when Y = 0 is written, whereas that in $V_{BL1}$ is significant when Y = 1 is written, selective voltage monitoring makes it possible to achieve a sufficiently large sense margin in either write current direction. At the beginning of the backup operation, the output node WR is pre-charged to $V_{DD}$ by setting STR low. Then, STR becomes high to detect write completion. If A and the selected data (SD) match and CMP becomes high, WR is discharged and the write current is automatically cut off. Note that if the backup data A is the same as the stored data M, the write operation is skipped immediately, which greatly reduces the backup energy consumption.

**Fig. 6** Block diagram of the self-terminated NVFF. Before power-off, the temporal data in the master latch (A) is stored into the nonvolatile storage cell. The data is restored into the slave latch just after power-on.

**Fig. 7** Schematic diagram of the self-terminated driver. At the beginning of the backup operation, the output node WR is pre-charged to $V_{DD}$ by setting STR low. Then, STR becomes high to detect write completion. If A and SD match and CMP becomes high, WR is discharged and the write current is automatically cut off.
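The difference between the self-terminated write and a fixed-width write can be summarized with a small behavioral sketch. It abstracts away the bit-line voltage sensing entirely: the function names are hypothetical, and the actual MTJ switching time is simply passed in as a number.

```python
from typing import Tuple


def self_terminated_write(a: int, m: int, switch_time_ns: float) -> Tuple[int, float]:
    """Return (stored bit, time the write current flows) for one FF backup.

    a is the data in the master latch and m is the data already held in the MTJ.
    """
    if a == m:
        return m, 0.0          # data already stored: the write is skipped
    return a, switch_time_ns   # WR falls as soon as the MTJ has switched


def fixed_pulse_write(a: int, m: int, switch_time_ns: float,
                      t_worst_ns: float) -> Tuple[int, float]:
    """Conventional write (assumed model): current flows for t_worst_ns regardless."""
    stored = a if (a == m or switch_time_ns <= t_worst_ns) else m
    return stored, t_worst_ns


print(self_terminated_write(0, 1, 12.0))
print(fixed_pulse_write(0, 1, 12.0, 20.0))
```

In this simplified view the fixed pulse can still fail when the device happens to switch later than the chosen worst-case width, whereas the self-terminated write keeps the current on until the switching is detected.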

Figure 8 shows the schematic diagram of the power-gating controller, which is implemented by using dynamic circuitry. To turn on the power switch, PGEN is set low. As a result, N0 and N1 are pre-charged, ST becomes low, and the power switch is turned on. During normal operation, PGEN is kept low. To perform power gating, PGEN is set high and the two PMOS transistors are turned off. When all the write-completion signals, namely F[0], F[1], ..., and F[N-1], become high, N0 is discharged and then N2 is also discharged. In this way, ST becomes high and the power switch is turned off. Note that the power line of the controller is directly connected to the $V_{DD}$ line.

**Fig. 8** Schematic diagram of the power-gating controller. To turn on the power switch, PGEN is set low. To perform power gating, PGEN is set high and the two PMOS transistors are turned off. When all the write-completion signals (F[0], F[1], ..., and F[N-1]) become high, N2 is discharged and ST becomes high.

**Fig. 9** Example of energy consumption during backup operation. By utilizing the proposed self-termination scheme, the write current of each FF is optimally turned off and the total energy consumption is greatly reduced compared to that of the worst-case-based method.
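The controller behavior described above amounts to a pre-charged (dynamic) AND of the write-completion signals. The sketch below is a logic-level model only; the class name is hypothetical, and the assumption that ST stays high until the next pre-charge (because a discharged dynamic node is not restored during evaluation) is ours.

```python
from typing import Sequence


class PowerGatingController:
    """Logic-level model of the dynamic-AND power-gating controller."""

    def __init__(self) -> None:
        self.st = 0  # ST low: power switch on

    def update(self, pgen: int, f: Sequence[int]) -> int:
        if pgen == 0:
            self.st = 0          # pre-charge phase: ST low, tile powered
        elif f and all(f):
            self.st = 1          # all F[i] high: internal nodes discharge, ST goes high
        return self.st           # otherwise ST keeps its value (assumed dynamic hold)

    @property
    def power_on(self) -> bool:
        return self.st == 0


ctrl = PowerGatingController()
ctrl.update(pgen=1, f=[1, 0, 1])   # one FF still writing: tile stays on
ctrl.update(pgen=1, f=[1, 1, 1])   # all backups done: tile is switched off
print(ctrl.power_on)
```

Holding ST after the tile is switched off matters in this model because the F[i] signals are generated inside the gated tile and disappear once its supply is cut.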

Figure 9 shows the energy consumption of a tile during the backup operation, where three CLB slices (thus *N* = 3) are embedded. $I_{TOTAL}$ and $I_L$ are the total current and the leakage current of the tile, respectively. FF[*i*] ($0 \le i \le N-1$) corresponds to the self-terminated FF in the *i*-th CLB slice (CLB[*i*]). $I_B[i]$ and $t_i$ correspond to the backup current of FF[*i*] and the time when the backup operation in FF[*i*] is finished, respectively. We can see that the total energy consumption is greatly reduced compared to the worst case (in this case, the write current pulse width is set to $t_{WORST}$). In the next section, we quantitatively evaluate the usefulness of the proposed method.

## 3. Evaluation

#### 3.1 Formulation

For the evaluation, let us formulate the energy consumption during the backup operation. First, the backup energy of FF[*i*] ($0 \le i \le N - 1$) is expressed as $t_i V_{DD} I_B[i]$. Therefore, the total backup energy $E_B^{TOTAL}$ is expressed as follows:

$$E_{B}^{TOTAL} = \sum_{i=0}^{N-1} t_{i} V_{DD} I_{B}[i] \tag{2}$$

The leakage energy consumed during the backup operation ($E_L$) depends on the longest $t_i$. Therefore, $E_L$ is expressed as follows:

$$E_L = V_{DD} \times I_L \times \max(t_0, t_1, \dots, t_{N-1}) \tag{3}$$
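Equations (2) and (3) can be evaluated directly, as in the following sketch for a tile with three FFs. All numerical values (supply voltage, backup currents, leakage current, and completion times) are hypothetical illustration values rather than results from this paper; with times in ns, voltages in V, and currents in μA, the energies come out in fJ.

```python
from typing import Sequence


def backup_energy_fj(t_ns: Sequence[float], i_b_ua: Sequence[float], vdd_v: float) -> float:
    """Eq. (2): sum of t_i * V_DD * I_B[i] over all FFs (ns * V * uA = fJ)."""
    return sum(t * vdd_v * i for t, i in zip(t_ns, i_b_ua))


def leakage_energy_fj(t_ns: Sequence[float], i_l_ua: float, vdd_v: float) -> float:
    """Eq. (3): leakage is paid until the slowest FF has finished its backup."""
    return vdd_v * i_l_ua * max(t_ns)


# Hypothetical values (assumptions for illustration only).
VDD = 1.0                     # V
I_L = 18.0                    # uA, tile leakage current
I_B = [100.0, 100.0, 100.0]   # uA, backup current per FF
t_self = [8.0, 12.0, 10.0]    # ns, self-terminated completion times
t_worst = [20.0, 20.0, 20.0]  # ns, fixed worst-case pulse width

e_self = backup_energy_fj(t_self, I_B, VDD) + leakage_energy_fj(t_self, I_L, VDD)
e_worst = backup_energy_fj(t_worst, I_B, VDD) + leakage_energy_fj(t_worst, I_L, VDD)
print(f"self-terminated: {e_self:.0f} fJ, worst-case: {e_worst:.0f} fJ")
```

The sketch shows the two separate savings: each FF stops drawing write current at its own $t_i$, and the leakage term is charged only up to the largest $t_i$ instead of the worst-case pulse width.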

$I_L$ is expressed as follows:

$$I_{L} = I_{L}^{CLB} + I_{L}^{CB} + I_{L}^{SB} \tag{4}$$

where $I_L^{CLB}$, $I_L^{CB}$, and $I_L^{SB}$ are the leakage currents of the CLB, CB, and SB, respectively. If the *M*-to-1 multiplexer in the CLB slice is designed by using an NMOS binary tree, the number of latches required for one CLB slice is given by $K\lceil \log_2 M \rceil + 1$. Therefore, the leakage current of the CLB is given as follows:

$$I_L^{CLB} = N \left[ I_L^{LAT} \left\{ K\lceil \log_2 M \rceil + 1 \right\} + I_L^{LUT} + I_L^{FF} + I_L^{MUX} \right] \tag{5}$$

where $I_L^{LAT}$, $I_L^{LUT}$, $I_L^{FF}$, and $I_L^{MUX}$ are the leakage currents of the latch, LUT, FF, and 2-to-1 MUX, respectively. The total number of latches in the CB is given as $IWF_{cin} + NWF_{cout}$. Thus,

$$I_L^{CB} = I_L^{LAT} (IWF_{cin} + NWF_{cout}) \tag{6}$$

As shown in Fig. 4, the SB is composed of $W$ basic blocks, each of which comprises 10 latches and 8 tri-state buffers. Thus,

$$I_L^{SB} = W(10I_L^{LAT} + 8I_L^{TBUF}) \tag{7}$$

where $I_L^{TBUF}$ is the leakage current of the tri-state buffer.
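As a worked instance of Eqs. (4) to (7), the sketch below plugs in the component leakage currents of Table 2 and the architecture parameters of Table 3. Two inputs are assumptions rather than values stated in the paper: the LUT size is taken as $K = 6$, and the buffer leakage listed in Table 2 is used for $I_L^{TBUF}$. $M = I + N$ follows from the CLB description.

```python
def ceil_log2(m: int) -> int:
    """Exact ceil(log2(m)) for m >= 1 using integer arithmetic."""
    return (m - 1).bit_length()


def clb_leakage(n, k, m, i_lat, i_lut, i_ff, i_mux):
    """Eq. (5): leakage of N CLB slices (currents in nA)."""
    latches_per_slice = k * ceil_log2(m) + 1
    return n * (i_lat * latches_per_slice + i_lut + i_ff + i_mux)


def cb_leakage(i, n, w, fc_in, fc_out, i_lat):
    """Eq. (6): leakage of the connection-block latches."""
    return i_lat * (i * w * fc_in + n * w * fc_out)


def sb_leakage(w, i_lat, i_tbuf):
    """Eq. (7): W basic blocks of 10 latches and 8 tri-state buffers."""
    return w * (10 * i_lat + 8 * i_tbuf)


# Table 2 (nA) and Table 3 values; K and I_TBUF are assumptions (see lead-in).
I_LAT, I_LUT, I_FF, I_MUX, I_TBUF = 9.1, 10.8, 69.3, 9.6, 5.5
W, I, N, FC_IN, FC_OUT = 72, 24, 8, 0.25, 0.25
K = 6
M = I + N

i_clb = clb_leakage(N, K, M, I_LAT, I_LUT, I_FF, I_MUX)
i_cb = cb_leakage(I, N, W, FC_IN, FC_OUT, I_LAT)
i_sb = sb_leakage(W, I_LAT, I_TBUF)
i_total = i_clb + i_cb + i_sb   # Eq. (4)
print(f"CLB {i_clb:.1f} nA, CB {i_cb:.1f} nA, SB {i_sb:.1f} nA, total {i_total:.1f} nA")
```

Under these assumptions the routing blocks (CB and SB) account for most of the tile leakage, which is consistent with the remark that leakage savings grow when the number of non-FF components increases.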

#### 3.2 Evaluation Setup

In this evaluation, a 90 nm CMOS technology together with MTJ devices whose parameters are summarized in Table 1 is used (note that these parameters are estimated from [13] and [17]). In the nanosecond region, it is reported that the variation of the MTJ switching time follows a Gaussian distribution [17]. Therefore, we assume that the standard deviation of the switching time ($\sigma_W$) is 30% of the ideal switching time; thus, $\sigma_W = 3$ ns. Figure 10 shows the simulated MTJ switching variation, where 2560 write operations are performed on a single MTJ device to obtain an accurate distribution of the switching time. We can see that the worst-case switching time is about 20 ns, so we set $T_{WORST} = 20$ ns.

**Table 1** MTJ design parameters.

| Parameter   | Value   |
|-------------|---------|
| $R_0$       | 6 kΩ    |
| $R_1$       | 13.5 kΩ |
| TMR ratio   | 1.25    |
| $I_{W1}$    | 50 μA   |
| $I_{W0}$    | -100 μA |
| $T_{IDEAL}$ | 10 ns   |

**Fig. 10** Simulated switching time variation.

**Table 2** Leakage current of each component.

| Component                          | Leakage current |
|------------------------------------|-----------------|
| $I_L^{LUT}$                        | 10.8 nA         |
| $I_L^{MUX}$                        | 9.6 nA          |
| $I_L^{FF}$ (Proposed)              | 69.3 nA         |
| $I_L^{FF}$ (w/o self-termination)  | 37.7 nA         |
| $I_L^{LAT}$                        | 9.1 nA          |
| $I_L^{BUF}$                        | 5.5 nA          |

**Table 3** FPGA architecture parameters.

| Parameter  | Value |
|------------|-------|
| $W$        | 72    |
| $I$        | 24    |
| $N$        | 8     |
| $F_{cin}$  | 0.25  |
| $F_{cout}$ | 0.25  |
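The choice of a worst-case pulse width of about 20 ns can be sanity-checked with a small Monte Carlo sketch of the Gaussian switching-time assumption (mean 10 ns, standard deviation 3 ns, 2560 writes). This uses Python's standard `random` module instead of the SPICE-level MTJ model used in the paper; the seed and the clamping of negative samples to zero are our assumptions.

```python
import random
from typing import List


def sample_switching_times(n_writes: int, t_ideal_ns: float = 10.0,
                           sigma_ns: float = 3.0, seed: int = 0) -> List[float]:
    """Draw switching times from a Gaussian, clamped at 0 ns (assumption)."""
    rng = random.Random(seed)
    return [max(0.0, rng.gauss(t_ideal_ns, sigma_ns)) for _ in range(n_writes)]


times = sample_switching_times(2560)
mean_ns = sum(times) / len(times)
worst_ns = max(times)
late_writes = sum(1 for t in times if t > 20.0)  # writes a 20 ns pulse would miss

print(f"mean {mean_ns:.2f} ns, worst {worst_ns:.2f} ns, >20 ns: {late_writes}")
```

Because 20 ns lies a little more than three standard deviations above the mean, the largest of a few thousand samples typically falls close to that value, which matches the reading of Fig. 10 in the text.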

The circuit components of the FPGA are designed by using NS-SPICE, which includes an MTJ macro model [18]. Table 2 summarizes the leakage currents of the FPGA components. To determine the FPGA architecture parameters shown in Fig. 4 and Fig. 5, an open-source CAD tool called Verilog-to-Routing (VTR) [19] is utilized. Table 3 summarizes the FPGA architecture parameters for the evaluation. These parameters are determined so that typical benchmark circuits can be implemented successfully [14], [15].



**Fig. 11** Waveform of self-terminated power gating.

**Fig. 12** Simulation results of the tiles (vertical axis: total energy [pJ]).

**Table 4** Summary of simulation results.

**Fig. 13** Comparison of break-even points (BEPs).

#### 3.3 Simulation Results

Figure 11 shows the simulated waveform of the self-terminated power-gating. We can confirm that when F[0], F[1], ..., and F[7] become high, ST becomes high and the power-switch is turned off.

Figure 12 summarizes the simulation results of two tiles: one with the proposed power-gating scheme and one with the worst-case-based scheme. We assume that random '0' and '1' patterns are applied to the inputs of the FFs, and the simulation is performed 20 times. We can see that the proposed power-gating scheme effectively reduces both write energy and leakage energy. As a result, a 66% reduction in total energy is demonstrated, as shown in Table 4. Note that if the number of circuit components other than FFs is increased, the leakage power reduction becomes more significant.

Figure 13 compares the break-even points (BEPs) of the conventional tile and the proposed tile, where $E_{NOPG}$, $E_{PG}$, and $T_{IDLE}$ are the energy consumption during the idle mode with no power gating, the energy consumption during the idle mode with power gating, and the length of the idle time, respectively. Because 20 ns is required for the backup operation, the tile can be power-gated for the remaining ($T_{IDLE} - 20$) ns. We define the BEP as the value of $T_{IDLE}$ at which $E_{NOPG} - E_{PG}$ first exceeds 0. As shown in Fig. 13, the BEP of the proposed power-gating scheme is 70% lower than that of the conventional one.
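The BEP definition can be illustrated with a deliberately simplified energy model: without power gating the tile leaks for the whole idle time, while with power gating it pays a one-off backup overhead and then only a residual sleep leakage. The model, the function name, and all numbers below are assumptions for illustration (recall energy and power-switch overhead are ignored), so the resulting ratio is not expected to reproduce the 70% figure of Fig. 13.

```python
def break_even_time_ns(e_overhead_fj: float, t_backup_ns: float,
                       i_leak_ua: float, vdd_v: float,
                       i_sleep_ua: float = 0.0) -> float:
    """T_IDLE at which E_NOPG equals E_PG under the simplified model.

    E_NOPG(T) = vdd * i_leak * T
    E_PG(T)   = e_overhead + vdd * i_sleep * (T - t_backup), for T >= t_backup
    """
    if i_leak_ua <= i_sleep_ua:
        raise ValueError("power gating never pays off if sleep leakage >= active leakage")
    t_cross = (e_overhead_fj - vdd_v * i_sleep_ua * t_backup_ns) / (
        vdd_v * (i_leak_ua - i_sleep_ua))
    return max(t_backup_ns, t_cross)


# Hypothetical overheads in fJ (write plus leakage energy during backup).
bep_proposed = break_even_time_ns(3216.0, 12.0, i_leak_ua=18.0, vdd_v=1.0)
bep_conventional = break_even_time_ns(6360.0, 20.0, i_leak_ua=18.0, vdd_v=1.0)
print(f"proposed: {bep_proposed:.1f} ns, conventional: {bep_conventional:.1f} ns")
```

In this model the BEP scales directly with the backup overhead, so any reduction in backup energy shortens the idle period needed before power gating becomes worthwhile.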

## 4. Conclusion

An energy-efficient and highly reliable nonvolatile FPGA using a self-terminated power-gating scheme has been proposed. Since the write current is automatically cut off just after the data in the FF is backed up in the nonvolatile device, the total backup energy can be minimized with no write failure. Moreover, when all the backup operations are completed, the corresponding power domain of the FPGA enters sleep mode, which minimizes standby power consumption. As a result, the total energy consumption is reduced by 66% in comparison with that of the conventional worst-case-based approach. An important remaining issue is how to apply and optimize the proposed method for CLBs with more complex structures under physical effects such as wiring delay and capacitance. Since a severe increase in standby power consumption can be avoided even if a state-of-the-art CMOS technology is used, it is expected that MTJ-based FPGAs can break through the power and performance walls that cannot be overcome by CMOS-only approaches, and thus open up a wide variety of battery-powered or energy-harvesting applications.

## Acknowledgments

This research is supported by JSPS KAKENHI Grant Number 25870067. A part of this work is supported by CIES consortium program.

## References

- [1] F.-L. Yuan, C.C. Wang, T.-H. Yu, and D. Markovic, "A multi-granularity FPGA with hierarchical interconnects for efficient and flexible mobile computing," IEEE J. Solid-State Circuits, vol.50, no.1, pp.137–149, Jan. 2015.
- [2] Y.Y. Liauw, Z. Zhang, W. Kim, A.E. Gamal, and S.S. Wong, "Nonvolatile 3D-FPGA with monolithically stacked RRAM-based configuration memory," IEEE Int. Conf. Solid-State Circuits Dig. Tech. Papers, pp.406–407, Feb. 2012.
- [3] M. Miyamura, M. Tada, T. Sakamoto, N. Banno, K. Okamoto, N. Iguchi, and H. Hada, "First demonstration of logic mapping on nonvolatile programmable cell using complementary atom switch," International Electron Device Meeting Tech. Dig., pp.247–250, Dec. 2012.
- [4] S. Yamamoto, Y. Shuto, and S. Sugahara, "Nonvolatile power-gating field-programmable gate array using nonvolatile static random access memory and nonvolatile flip-flops based on pseudo-spin-transistor architecture with spin-transfer-torque magnetic tunnel junctions," Jpn. J. Appl. Phys., vol.51, no.11S, p.11PB02, 2012.
- [5] D. Suzuki, M. Natsui, T. Endoh, H. Ohno, and T. Hanyu, "Six-input lookup table circuit with 62% fewer transistors using nonvolatile logic-in-memory architecture with series/parallel-connected magnetic tunnel junctions," J. Appl. Phys., vol.111, no.7, 07E318, Feb. 2012.
- [6] D. Suzuki, M. Natsui, A. Mochizuki, S. Miura, H. Honjo, H. Sato, S. Fukami, S. Ikeda, T. Endoh, H. Ohno, and T. Hanyu, "Fabrication of a 3000-6-input-LUTs embedded and block-level power-gated non-volatile FPGA chip using p-MTJ-based logic-in-memory structure," Symp. VLSI Circuits Dig. Tech. Papers, pp.172–173, June 2015.

- [7] K. Huang, Y. Ha, R. Zhao, A. Kumar, and Y. Lian, "A low active leakage and high reliability phase change memory (PCM) based non-volatile FPGA storage element," IEEE Trans. Circuits and Syst.-I, vol.61, no.9, pp.2605–2613, Dec. 2014.
- [8] D. Suzuki and T. Hanyu, "Design of an MTJ-based nonvolatile lookup table circuit using an energy-efficient single-ended logic-in-memory structure," IEEE Midwest Symp. Circuits and Systems, pp.317–320, Aug. 2015.
- [9] D. Suzuki, M. Natsui, A. Mochizuki, and T. Hanyu, "Cost-efficient self-terminated write driver for spin-transfer-torque RAM and logic," IEEE Trans. Magn., vol.50, no.11, pp.1–4, Nov. 2014.
- [10] D. Suzuki and T. Hanyu, "Magnetic-tunnel-junction based low-energy nonvolatile flip-flop using an area-efficient self-terminated write driver," J. Appl. Phys., vol.117, no.17, pp.1–3, Jan. 2015.
- [11] S. Ikeda, K. Miura, H. Yamamoto, K. Mizunuma, H.D. Gan, M. Endo, S. Kanai, J. Hayakawa, F. Matsukura, and H. Ohno, "A perpendicular-anisotropy CoFeB–MgO magnetic tunnel junction," Nature Materials, vol.9, no.9, pp.721–724, July 2010.
- [12] C. Yoshida, T. Ochiai, Y. Iba, Y. Yamazaki, K. Tsunoda, A. Takahashi, and T. Sugii, "Demonstration of non-volatile working memory through interface engineering in STT-MRAM," IEEE Symp. VLSI Technology, Dig. Tech. Papers, pp.59–60, June 2012.
- [13] H. Sato, M. Yamanouchi, S. Ikeda, S. Fukami, F. Matsukura, and H. Ohno, "MgO/CoFeB/Ta/CoFeB/MgO recording structure in magnetic tunnel junctions with perpendicular easy axis," IEEE Trans. Magn., vol.49, no.7, pp.4437–4440, July 2013.
- [14] V. Betz, J. Rose, and A. Marquardt, "Architecture and CAD for deep-submicron FPGAs," Kluwer Academic Publishers, 1999.
- [15] E. Ahmed and J. Rose, "The effect of LUT and cluster size on deep-submicron FPGA performance and density," IEEE Trans. VLSI Syst., vol.12, no.3, pp.288–298, March 2004.
- [16] D. Suzuki, M. Natsui, A. Mochizuki, S. Miura, H. Honjo, K. Kinoshita, S. Fukami, H. Sato, S. Ikeda, T. Endoh, H. Ohno, and T. Hanyu, "Design and fabrication of a perpendicular magnetic tunnel junction based nonvolatile programmable switch achieving 40% less area using shared-control transistor structure," J. Appl. Phys., vol.115, no.17, pp.1–3, April 2014.
- [17] P. Wang, X. Chen, Y. Chen, H. Li, S. Kang, X. Zhu, and W. Wu, "A 1.0V 45nm nonvolatile magnetic latch design and its robustness analysis," Proc. IEEE Custom Integrated Circuits Conf., pp.1–4, Sept. 2011.
- [18] N. Sakimura, R. Nebashi, Y. Tsuji, H. Honjo, T. Sugibayashi, H. Koike, T. Ohsawa, S. Fukami, T. Hanyu, H. Ohno, and T. Endoh, "High-speed simulator including accurate MTJ models for spintronics circuit design," Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), pp.1971–1974, May 2012.
- [19] J. Luu, N. Ahmed, K.B. Kent, J. Anderson, J. Rose, V. Betz, J. Goeders, M. Wainberg, A. Somerville, T. Yu, K. Nasartschuk, M. Nasr, S. Wang, and T. Liu, "VTR 7.0: next generation architecture and CAD system for FPGAs," ACM Trans. Reconf. Tech. Syst., vol.7, no.2, pp.6:1–6:30, June 2014.



**Daisuke Suzuki** received the B.E., M.E., and D.E. degrees from Tohoku University, Sendai, Japan, in 2004, 2006, and 2009, respectively. From 2010 to 2014, he was a Research Associate with the Center for Spintronics Integrated Systems, Tohoku University, Sendai, Japan. From 2014 to 2015, he was an Assistant Professor with the Center for Innovative Integrated Electronic Systems, Tohoku University. He is currently an Assistant Professor with the Frontier Research Institute for Interdisciplinary Sciences, Tohoku University. His main interests and activities are nonvolatile logic and its application to reconfigurable systems. Dr. Suzuki was the recipient of the Excellent Paper Award of The Institute of Electronics, Information and Communication Engineers of Japan in 2010.



**Takahiro Hanyu** received the B.E., M.E., and D.E. degrees in electronic engineering from Tohoku University, Sendai, Japan, in 1984, 1986, and 1989, respectively. He is currently a Professor in the Research Institute of Electrical Communication, Tohoku University. His general research interests include nonvolatile logic circuits and their applications to ultra-low-power and/or highly dependable VLSI processors, and post-binary/asynchronous network-on-chip architecture and its application to brain-inspired VLSI systems (called "Brainware LSI"). Dr. Hanyu received the Sakai Memorial Award from the Information Processing Society of Japan in 2000, the Judge's Special Award at the 9th LSI Design of the Year from the Semiconductor Industry News of Japan in 2002, the Special Feature Award at the University LSI Design Contest from ASP-DAC in 2007, the APEX Paper Award of Japan Society of Applied Physics in 2009, the Excellent Paper Award of IEICE, Japan, in 2010, Ichimura Academic Award in 2010, the Best Paper Award of IEEE ISVLSI 2010, the Paper Award of SSDM 2012, the Best Paper Finalist of IEEE ASYNC 2014, and the Commendation for Science and Technology by MEXT, Japan in 2015.