# Leakage Current Reduction Using Subthreshold Source-Coupled Logic

Armin Tajalli, Student Member, IEEE, and Yusuf Leblebici, Senior Member, IEEE

Abstract—The performance of subthreshold source-coupled logic (STSCL) circuits for ultra-low power applications is explored. It is shown that the power consumption of STSCL circuits can be reduced well below the level imposed by the subthreshold leakage current of static CMOS circuits. STSCL circuits exhibit a better power-delay performance than their static CMOS counterparts in situations where the leakage current constitutes a significant part of the power dissipation of static CMOS gates. The superior control over power consumption, together with lower sensitivity to process and supply voltage variations, makes the STSCL topology very suitable for implementing ultra-low-power, low-frequency digital systems in modern nanometer scale technologies. An analytical approach for comparing the power-delay performance of these two topologies is proposed.

## I. Introduction

To optimize the power consumption of integrated digital CMOS systems, different approaches have been proposed in the literature [1]. These techniques (e.g. multiple threshold voltage devices or various power management techniques [1]-[3]) can be used to reduce the system power dissipation with respect to the work load. In ultra-low power applications, where the power dissipation is a crucial parameter, the supply voltage $(V_{DD})$ is generally reduced below the threshold voltage $(V_T)$ of MOS devices [4], [5]. Reducing the supply voltage or choosing high threshold voltage (HVT) devices results in a smaller $V_{eff} = V_{DD} - V_T$ value and hence less power consumption [2]. However, reducing $V_{eff}$ also reduces the ratio of the on-current of a logic gate $(I_{ON})$ to its leakage current $(I_{OFF})$, as shown in Fig. 1(a).
Reduction in $\gamma = I_{ON}/I_{OFF}$ degrades the reliability and power efficiency of the circuit, requiring special design techniques to implement robust logic operations [4].

The authors are with the Microelectronic Systems Laboratory (LSM), Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland (e-mail: armin.tajalli@epfl.ch, yusuf.leblebici@epfl.ch). Manuscript received October 15, 2008; revised October 15, 2008.

Fig. 1. (a) Simulated turn-on to turn-off current ratio $(\gamma = I_{ON}/I_{OFF})$ of a static CMOS inverter gate implemented in 65nm technology in different corner cases. (b) Simulated variation of drain bias current of an NMOS transistor in 65nm due to thermal variations. Here, the current is normalized to the current level at $T_0$ = 27°C.

Wide variation of circuit characteristics, such as speed of operation and power dissipation, due to variations of process parameters, supply voltage, and temperature (PVT) is the other important issue in the design of ultra-low power digital circuits in modern nanometer scale technologies [6]. The effects of such variations become more evident when the devices are biased in the subthreshold regime. Figure 1(a) depicts the variation of $\gamma$ for different process corner parameters and Fig. 1(b) shows the variation of drain current versus temperature at different $V_{GS}$ values. As illustrated in Fig. 1(b), the variation of the drain bias current increases on moving towards the subthreshold regime. Meanwhile, in this regime the operation frequency and power consumption both depend exponentially on the supply voltage. Therefore, very accurate control of $V_{DD}$ is required [7]. The design of such high-precision supply voltage control systems, however, becomes more challenging in battery operated systems, where the power budget is very restricted and the battery voltage also drops with time.
Subthreshold source-coupled logic (STSCL) topology has recently been shown to be an alternative approach for implementing ultra-low power circuits [8], [9]. The accurate control over the power consumption of each gate makes this topology very suitable for very low bias current operation, where the dissipation of conventional static CMOS circuits is limited by their subthreshold leakage current. Meanwhile, the gate delay in this configuration does not depend on the supply voltage and hence there is a very low sensitivity to supply voltage variations. The performance variation due to PVT variations is also much smaller in this type of circuit than in the static CMOS topology, as will be shown later.

In this article, an analytical approach for analyzing and comparing the leakage and power-delay performance of CMOS and STSCL topologies is presented. In Section II, after a very short introduction to subthreshold SCL topologies, the main performance parameters of this topology are analyzed. In Section III, power-speed tradeoffs for the CMOS topology are studied, and Section IV provides a comparison between the two topologies.

## II. PERFORMANCE ANALYSIS OF SUBTHRESHOLD SOURCE-COUPLED LOGIC

### A. STSCL Topology

Fig. 2. Subthreshold SCL buffer (inverter) circuit schematic [9].

Figure 2 shows the topology of a subthreshold SCL circuit [8]. In this topology, the n-channel metal-oxide-semiconductor (NMOS) switching transistors as well as the p-channel metal-oxide-semiconductor (PMOS) load devices are biased in the subthreshold regime. In order to execute a Boolean operation, the voltage swing at the input and output of this circuit should be $V_{SW} > 4 \cdot n_n U_T$ [10] ($n_n$ is the subthreshold slope factor of the NMOS differential pair devices, $U_T = kT/q$ is the thermal voltage, $k$ is Boltzmann's constant, $T$ is the junction temperature in kelvin, and $q$ stands for the elementary charge). Satisfying this constraint, the circuit shown in Fig.
2 will also show enough gain for successful logic operation [9]. To provide the required voltage swing at very low tail bias current values $(I_{SS})$, very high valued load resistances are required $(R_L = V_{SW}/I_{SS})$. The load resistances should occupy a very small area and be well controllable, so that their resistance can be adjusted with respect to the tail bias current. In Fig. 2, PMOS transistors with shorted drain-bulk terminals have been utilized to implement the proposed high resistance load devices [8]. Using small size PMOS devices, this structure can be used to implement very high valued resistances with a relatively high voltage swing at the output. A replica bias circuit can be used to control the resistance of the load devices and hence adjust the output voltage swing with respect to the tail bias current [9]. The replica bias circuit also tracks variations in temperature and supply voltage and hence compensates their effect on the circuit performance.

### B. Power-Speed Tradeoff in STSCL Circuits

Fig. 3. Measured STSCL MUX gate delay for different tail bias currents in $0.18\mu m$ CMOS technology.

In contrast to CMOS gates, where there is no static power consumption (neglecting the leakage current), each STSCL gate draws a constant bias current of $I_{SS}$ from the supply (Fig. 2). Therefore, the power consumption of each STSCL gate can be calculated by

$$P_{diss,STSCL,1} = V_{DD}I_{SS}. \tag{1}$$

Meanwhile, the time constant at the output node of each STSCL gate, i.e.

$$\tau = R_L \cdot C_L \approx (V_{SW}/I_{SS}) \cdot C_L \tag{2}$$

is the main speed limiting factor in this topology ($C_L$ is the total output loading capacitance). Based on (2), one can choose the proper $I_{SS}$ value to operate at the desired frequency. Since the power consumption and delay of each gate depend only on $I_{SS}$, which can be controlled very precisely, this circuit exhibits very low sensitivity to process variations.
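As a rough numerical illustration of (1) and (2), the following Python sketch evaluates the power, the output time constant, and an estimated delay of one STSCL gate. The function names, the example operating point, and the factor $\ln 2$ used to turn the time constant into a 50% delay (a first-order RC assumption) are choices made for this sketch and are not measured values.

```python
import math


def stscl_gate(i_ss, v_dd, v_sw, c_l):
    """Return (power, tau, delay) for one STSCL gate.

    i_ss: tail bias current in A, v_dd: supply voltage in V,
    v_sw: output voltage swing in V, c_l: load capacitance in F.
    """
    power = v_dd * i_ss          # Eq. (1): static power of one gate
    tau = (v_sw / i_ss) * c_l    # Eq. (2): R_L * C_L with R_L = V_SW / I_SS
    delay = math.log(2) * tau    # assumption: 50% point of a first-order RC response
    return power, tau, delay


def tail_current_for_delay(t_d, v_sw, c_l):
    """Invert the delay estimate above: tail current needed for a target delay."""
    return math.log(2) * v_sw * c_l / t_d


if __name__ == "__main__":
    # Hypothetical operating point, chosen only for illustration.
    power, tau, delay = stscl_gate(i_ss=1e-9, v_dd=0.4, v_sw=0.2, c_l=10e-15)
    print(f"P = {power:.3e} W, tau = {tau:.3e} s, t_d = {delay:.3e} s")
    i_needed = tail_current_for_delay(1e-6, 0.2, 10e-15)
    print(f"I_SS for t_d = 1 us: {i_needed:.3e} A")
```

In this sketch, scaling `i_ss` by a factor of ten raises the power and lowers the delay by the same factor, which is the tradeoff that (1) and (2) describe.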
Meanwhile, since the speed of operation in this case does not depend on the threshold voltage of the MOS devices, it is not necessary to use special process options to obtain low threshold voltage devices, as is frequently done for static CMOS.

As shown in Fig. 3, the gate delay can be adjusted over a very wide range and is inversely proportional to the tail bias current. This figure shows that the tail bias current can be reduced to about $10\,\mathrm{pA}$, where the forward bias current of the source-bulk diode of the PMOS load devices becomes comparable to $I_{SS}$. Considering (1), it can also be concluded that the power consumption is constant and independent of the operation frequency. Therefore, STSCL circuits should be operated at their maximum activity rate to achieve the best efficiency. It is also important to note that the gate delay does not depend on the supply voltage, while it scales inversely with the tail bias current. This property can be exploited in applications in which the supply can vary during operation.

Based on (1) and (2), the power-delay product (PDP) of each gate can be approximately calculated by

$$PDP_{STSCL,1} \approx \ln 2 \times V_{DD}V_{SW}C_L \tag{3}$$

which is directly proportional to the supply voltage, the voltage swing at the output of the gate, and the total load capacitance.

To better understand the power-speed tradeoff in the STSCL configuration, consider a simple STSCL circuit constructed of $N$ cascaded identical gates ($N$ is the logic depth) operating at a frequency of $f_{op}$. Requiring the total delay of the chain, $N t_d$ with $t_d \approx \ln 2 \cdot \tau$, to fit within one period $1/f_{op}$ sets the tail current of each gate to $I_{SS} \approx \ln 2 \times N V_{SW} C_L f_{op}$. Using (1) and (2), the total power consumption of this chain is then

$$P_{diss,STSCL,N} \approx \ln 2 \times N^2 V_{DD,STSCL} V_{SW} C_L f_{op} \tag{4}$$

which increases quadratically with the logic depth and linearly with the operation frequency.

### C. Process and Temperature Variation

Fig. 4. Variation of the STSCL MUX gate delay due to temperature variations in $0.18\mu m$ CMOS technology (measurement and simulation).

Considering (4), it can be concluded that the device parameters, and especially the threshold voltage, do not influence the speed-power tradeoff in the STSCL topology. As mentioned before, the replica bias circuit (which generates and adjusts the gate voltage of the PMOS load devices) compensates the effect of temperature variations [9]. Therefore, this topology exhibits a very low sensitivity to PVT variations. Figure 4 shows the simulated gate delay versus load capacitance at different temperatures. The variation of the gate delay due to temperature is less than 4% (Fig. 4). Based on this figure, $t_d \approx 1.38 \times 10^8 C_L$, which is very close to the value predicted by (2) and also agrees very well with the measurement results.

### D. Minimum Supply Voltage

Since the devices are biased in weak inversion, it is possible to use HVT devices in STSCL circuits without affecting the speed of operation. The minimum supply voltage of an STSCL gate is (Fig. 2):

$$V_{DD,min} = V_{CS} + V_{GS,1} \tag{5}$$

where $V_{CS}$ is the required headroom for the current source. Since all the devices are in subthreshold, $V_{CS} \geq 4U_T$. Meanwhile, $V_{GS,1} = V_{T0} + n_n U_T \ln(I_{SS}/I_0)$, where $V_{T0}$ stands for the threshold voltage of M1-M2 and $I_0 = 2n_n(W/L_{eff})U_T^2$ [11]. Notice that for complete switching, $V_{GS,1}$ should always be larger than $V_{SW}$, i.e. $V_{GS,1} > V_{SW}$. Therefore, assuming $V_{SW} \approx 6U_T$, the minimum supply voltage will be:

$$V_{DD,min} \approx 10 U_T. \tag{6}$$

Measurements show that it is possible to reduce the supply voltage of an $(8\times8)$ multiplier implemented in the STSCL topology down to 350mV [9].

## III. PERFORMANCE ANALYSIS OF CMOS LOGIC CIRCUITS

Fig. 5. (a) A chain of N identical CMOS gates. Note that the type of logic gate used in the chain is arbitrary. (b) Modeling the current waveform.

The required power consumption of a chain of $N$ STSCL gates operating at a frequency of $f_{op}$ was calculated in (4). Similar to that case, consider a chain of identical CMOS gates. Figure 5(a) illustrates the test structure and Fig. 5(b) depicts the simplified waveform of the current drawn from the supply by a single gate. The peak current $(I_{peak})$ and leakage current $(I_{leak})$ drawn from the supply by the logic cell both depend on $V_{DD}$ and the size ratio of the devices. Meanwhile, $I_{peak}$ depends on the transition time at the input of the gate. To simplify the calculations, we assume that the transition time at the input of each gate is comparable to the intrinsic transition time at the output of that gate when it drives $C_L$. This assumption is very close to reality when the logic depth is high. With this constraint, $I_{peak}$ depends only on $V_{DD}$. The rms (root mean square) power consumption of the circuit shown in Fig. 5(a) can be calculated by:

$$P_{diss,CMOS,N} = V_{DD} \cdot \sqrt{\frac{1}{T} \int_0^T i_{DD}^2(t)\, dt}. \tag{7}$$

Considering the simplified waveform of Fig. 5(b) for the supply current, the total rms power consumption of the circuit will be:

$$P_{diss,CMOS,N} \approx N I_{leak} V_{DD} \sqrt{1 + \frac{\alpha \cdot \eta}{3} \left(\frac{\gamma^2}{N^2} + \frac{\gamma}{N} - 2\right)} \tag{8}$$

where $\alpha = f_{op}/f_{max}$ represents the activity rate, $f_{max} = 1/(2t_d)$ is the maximum operation frequency of a single gate, $\gamma = I_{peak}/I_{leak}$, $f_{op} = 1/T$, and:

$$\eta = \begin{cases} N/2 & \text{if } N \text{ even,} \\ (N+1)/2 & \text{if } N \text{ odd.} \end{cases}$$

Here, $\eta$ is used to take into account that the supply current depends only on the current that is used for charging the load capacitances. As expected, the minimum power consumption of the circuit is determined by the leakage current when the activity rate is very low ($\alpha \approx 0$).
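The behaviour of (8) can be explored numerically with a short Python sketch. The function names and the example values of $N$, $I_{leak}$, $V_{DD}$, $\gamma$, and $\alpha$ below are hypothetical and serve only to show the leakage floor at $\alpha = 0$ and the growth of the rms power with activity rate.

```python
import math


def eta(n):
    """Eq. (8) helper: N/2 for even N, (N+1)/2 for odd N."""
    return n // 2 if n % 2 == 0 else (n + 1) // 2


def cmos_chain_power(n, i_leak, v_dd, gamma, alpha):
    """Approximate rms power of a chain of n CMOS gates, following Eq. (8).

    gamma = I_peak / I_leak, alpha = f_op / f_max (activity rate).
    """
    bracket = (gamma / n) ** 2 + gamma / n - 2.0
    under_root = 1.0 + (alpha * eta(n) / 3.0) * bracket
    return n * i_leak * v_dd * math.sqrt(under_root)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    n, i_leak, v_dd, gamma = 10, 1e-9, 0.3, 1000.0
    for alpha in (0.0, 1e-6, 1e-4, 1e-2):
        p = cmos_chain_power(n, i_leak, v_dd, gamma, alpha)
        print(f"alpha = {alpha:.0e}: P = {p:.3e} W")
```

At `alpha = 0` the function returns exactly `n * i_leak * v_dd`, the leakage floor discussed above.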
At higher operating frequencies, where the dynamic power consumption becomes dominant, the power dissipation is proportional to the square root of the operating frequency. Figure 6 illustrates the power consumption versus speed of operation (or activity rate) as predicted by (8). By increasing the logic depth, the total power consumption scales up proportionally while the maximum speed of operation reduces by the same factor.

Fig. 6. Power consumption of a chain of CMOS gates versus activity rate $(\alpha)$ for logic depth of N=10.

Based on (8), it can be found that for activity rates smaller than a "critical activity rate" ($\alpha_C$) given by:

$$\alpha_C \approx 3N^2/(\eta \cdot \gamma^2) \approx 6N/\gamma^2 \tag{9}$$

the subthreshold leakage power consumption will be dominant, while for higher activity rates, the dynamic power consumption comprises the main part of the power consumption. Since $\alpha_C$ is proportional to $1/\gamma^2=(I_{leak}/I_{peak})^2$, $\alpha_C$ increases quadratically as $\gamma$ is reduced. This means that in more advanced CMOS technologies, the contribution of leakage current will be more evident and $\alpha_C$ will be higher. As illustrated in Fig. 7, $\alpha_C$ increases considerably on moving towards technologies with smaller feature sizes. While in $0.18\mu m$ CMOS technology $\alpha_C\approx 10^{-4}$ for $V_{DD}$ = 0.2V, it increases by almost four orders of magnitude in 65nm CMOS technology at the same supply voltage.

Based on Fig. 5(b), the maximum operating frequency of a CMOS gate $(f_{max})$ can be estimated by:

$$f_{max} \approx I_{peak}/(2V_{DD}C_L). \tag{10}$$

Although this is a simplified relationship, it predicts $f_{max}$ with good accuracy. To complete the calculations, it is necessary to estimate the peak and leakage currents. The EKV model can provide a general expression for the drain current of MOS devices operating in different regions and at different supply voltages [11].
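The expression for $\alpha_C$ in (9) can be recovered from (8) in two short steps. The sketch below assumes $\gamma \gg N$, so that the $\gamma^2/N^2$ term dominates the bracket in (8), and approximates $\eta \approx N/2$:

$$P_{diss,CMOS,N} \approx N I_{leak} V_{DD} \sqrt{1 + \frac{\alpha\, \eta\, \gamma^2}{3N^2}}$$

The leakage and dynamic contributions under the square root are equal when the second term reaches unity:

$$\frac{\alpha_C\, \eta\, \gamma^2}{3N^2} = 1 \;\Rightarrow\; \alpha_C = \frac{3N^2}{\eta\, \gamma^2} \approx \frac{6N}{\gamma^2}$$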
Using the EKV model, it is possible to calculate the peak and leakage currents at $V_{GS}$ = $V_{DD}$ and $V_{GS}$ = 0V, respectively. Figure 8 depicts the peak and leakage currents for a CMOS inverter gate designed in 65nm technology. It is noticeable that the leakage current does not reduce exponentially with the supply voltage when the devices operate in subthreshold. This implies that reducing the supply voltage does not help very much to reduce the leakage current. The other important parameter is $\gamma = I_{peak}/I_{leak}$, which is an indicator of power efficiency in the CMOS topology. While $\gamma \approx 10^4$ for $V_{DD} > 0.6$ V, it reduces rapidly as the supply voltage is reduced and ultimately gets close to unity for very low supply voltages. In addition to (8), the EKV model provides the necessary information to estimate the power consumption versus speed of operation for the CMOS topology.

Fig. 7. Variation of the critical activity rate $(\alpha_C)$ as a function of $V_{DD}$, for different technology nodes. In the 65nm technology node, $\alpha_C$ is shown for both high $V_T$ and low $V_T$ devices.

Fig. 8. Peak current and leakage current of a CMOS inverter gate as a function of $V_{DD}$ in 65nm CMOS.

The analysis done in this Section does not depend on the type of logic cell used in the test structure shown in Fig. 5, and it is sufficient to use the $I_{peak}$ and $I_{leak}$ corresponding to the logic circuit of interest to complete the discussion.

## IV. PERFORMANCE COMPARISON

Using (4) and (8), it is possible to compare the power consumption of two chains of identical gates with logic depth of $N$ that are constructed based on CMOS and STSCL topologies.
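One way to carry out this comparison is to equate (4) and (8) in the two limiting regimes of (8). The steps below are a sketch under the assumptions $\gamma \gg N$ and $\eta \approx N/2$; the shorthand $K$ is introduced here only for compactness.

$$K = \ln 2 \cdot V_{DD,STSCL} V_{SW} C_L f_{op}, \qquad P_{diss,STSCL,N} \approx N^2 K$$

For $\alpha \ll \alpha_C$, (8) reduces to the leakage floor $N I_{leak} V_{DD}$, and STSCL consumes less power when

$$N^2 K < N I_{leak} V_{DD}.$$

For $\alpha \gg \alpha_C$, (8) reduces to $I_{peak} V_{DD} \sqrt{\alpha N / 6}$, and the condition becomes

$$N^2 K < I_{peak} V_{DD} \sqrt{\frac{\alpha N}{6}}.$$

Solving each inequality for $N$ gives the largest logic depth at which STSCL remains the lower-power option.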
Based on this comparison, the maximum logic depth for which the STSCL topology exhibits lower power consumption than the CMOS topology is:

$$N_{max} \approx \begin{cases} \dfrac{I_{leak}V_{DD}}{\ln 2 \cdot V_{SW}C_L f_{op}V_{DD,STSCL}} & \text{if } \alpha \ll \alpha_C, \\ \sqrt[3]{\dfrac{\alpha}{6} \cdot \left(\dfrac{I_{peak}V_{DD}}{\ln 2 \cdot V_{SW}C_L f_{op}V_{DD,STSCL}}\right)^2} & \text{if } \alpha \gg \alpha_C, \end{cases} \tag{11}$$

where $V_{DD}$ is the supply voltage of the CMOS circuit. Figure 9(a) compares the power consumption of CMOS and STSCL XOR gates for a logic depth of 20 as a function of operation frequency, based on simulation results in 65nm CMOS. It can be seen clearly that the power consumption of CMOS gates cannot be reduced beyond a certain level due to leakage (both for the LVT and HVT case), whereas the STSCL topology offers smaller power consumption below the cross-over frequency. The maximum logic depth for which an STSCL circuit with operating frequency of $f_{op}$ consumes less power than its CMOS counterpart is shown in Fig. 9(b) for 65nm CMOS technology. The comparison has been made for XOR gates and the simulation results are depicted for both HVT and LVT (low $V_T$) devices.

Fig. 9. (a) Simulated power consumption versus operation frequency for CMOS and STSCL XOR gates with logic depth of N=20. Note that CMOS power consumption cannot be reduced beyond a certain level due to leakage. (b) Maximum logic depth for which the STSCL topology exhibits less power consumption than the CMOS topology based on (11) (dashed lines) in comparison to the simulation results. The results are shown for both low $V_T$ (top) and high $V_T$ devices (bottom) in 65nm CMOS technology. XOR logic gates are used for this comparison. Here, $V_{DD,STSCL}$ = 400mV and $V_{SW}$ = 200mV.

As expected, increasing the logic depth reduces the efficiency of the STSCL topology. However, for low supply voltages or at low operation frequencies where
the leakage current is more evident, STSCL starts to exhibit better performance. This can also be concluded from (11(b)). At high frequencies, $N_{max}$ increases with the activity rate. Therefore, the STSCL (or SCL) topology needs to be employed at high activity rates. On the other hand, Fig. 9 and (11(b)) imply that as the operation frequency reduces, $N_{max}$ increases and hence the power efficiency of STSCL increases in comparison to CMOS. In other words, in nanometer scale technologies where the subthreshold leakage current in the CMOS topology is more evident, the STSCL topology can offer a more power efficient solution, even at low activity rates (or equivalently, for higher logic depths). This is in addition to the superior power-delay performance of the SCL topology at very high activity rates or very high frequencies [9].

Figure 9(b) also shows that with HVT devices, the power efficiency of the CMOS topology improves. However, the main issue with HVT devices is that they cannot be used at very low supply voltages, mainly because of reliability issues.

Figure 10 shows the measurement results for two $(8\times8)$ array multipliers designed based on CMOS and STSCL topologies. The test circuits are implemented in $0.18\mu m$ CMOS technology, where the leakage current is much lower than in 65nm CMOS. As depicted in Fig. 10, for frequencies below 80kHz, the STSCL topology consumes less power and exhibits less variation due to process and temperature differences. As predicted in Fig. 9, it is expected that in more advanced technologies the cross-over frequency increases.

Fig. 10. Measured power consumption versus operating frequency for two $8\times8$ STSCL and CMOS array multipliers fabricated in $0.18\mu m$ CMOS technology. The simulation results in different process corners and temperatures are also shown.

## V. CONCLUSION

An analytical approach for studying and comparing the performance of ultra-low power CMOS and STSCL circuits has been presented.
While there is a tight tradeoff among the power consumption, speed of operation, and supply voltage in the design of CMOS digital circuits, the STSCL topology provides a more flexible design option for ultra-low-power applications. The frequency range in which the STSCL topology exhibits superior performance over the static CMOS topology depends on the logic depth and on the leakage current in CMOS circuits. While the STSCL topology occupies more area and its supply voltage cannot be reduced below $10 U_T$, it can be used successfully to reduce the power consumption of digital systems well below the levels limited by CMOS subthreshold leakage current when the circuit operates at low frequencies.

## REFERENCES

- [1] M. Pedram and J. Rabaey, Power Aware Design Methodologies, Kluwer Academic Publishers, 2002.
- [2] M. Anis and M. Elmasry, Multi-Threshold CMOS Digital Circuits, Managing Leakage Power, Kluwer Academic Publishers, 2003.
- [3] H. Soeleman, K. Roy, and B. C. Paul, "Robust subthreshold logic for ultra-low power operation," *IEEE Trans. on Very Large Scale Integration (VLSI) Syst.*, vol. 9, no. 1, pp. 90-99, Feb. 2001.
- [4] B. H. Calhoun, A. Wang, and A. Chandrakasan, "Modeling and sizing for minimum energy operation in subthreshold circuits," *IEEE J. Solid-State Circuits*, vol. 40, no. 9, pp. 1778-1786, Sep. 2005.
- [5] B. Nikolić, "Design in the power-limited scaling regime," *IEEE Trans. on Electron Devices*, vol. 55, no. 1, pp. 71-83, Jan. 2008.
- [6] N. Verma, J. Kwong, and A. Chandrakasan, "Nanometer MOSFET variation in minimum energy subthreshold circuits," *IEEE Trans. on Electron Devices*, vol. 55, no. 1, pp. 163-174, Jan. 2008.
- [7] E. Alon and M. Horowitz, "Integrated regulation for energy-efficient digital circuits," *IEEE J. Solid-State Circuits*, vol. 43, no. 8, pp. 1795-1807, Aug. 2008.
- [8] A. Tajalli, E. Vittoz, Y. Leblebici, and E. J. Brauer, "Ultra-low power subthreshold current-mode logic utilising PMOS load device concept," *IET Electronics Letters*, vol. 43, no. 17, pp. 911-913, Aug. 2007.
- [9] A. Tajalli, E. J. Brauer, Y. Leblebici, and E. Vittoz, "Subthreshold source-coupled logic circuits for ultra-low power applications," *IEEE J. Solid-State Circuits*, vol. 43, no. 7, pp. 1699-1710, Jul. 2008.
- [10] P. R. Gray, P. J. Hurst, S. H. Lewis, and R. G. Meyer, Analysis and Design of Analog Integrated Circuits, John Wiley & Sons Inc., Fourth Edition, 2000.
- [11] C. C. Enz and E. A. Vittoz, Charge-Based MOS Transistor Modeling, John Wiley & Sons Inc., 2006.