1 Introduction

One important requirement for sensors and cyber-physical systems developed for Human-Computer Interaction (HCI) applications is the reduced size of the hardware, which commonly leads to the use of state-of-the-art CMOS nanotechnology to produce such hardware sensors. Moreover, the Internet of Everything (IoE) paradigm is enabling interaction with a wide variety of devices, which tend to be self-powered and equipped with complex digital systems, including microcontrollers, sensors and sensor networks. Therefore, power consumption in CMOS integrated circuits has, as never before, a huge importance in today’s cyber-physical systems and sensors for HCI applications, as all self-powered devices strive for ever-longer battery life together with ever-smaller dimensions. However, the use of scaled CMOS technology with reduced power budgets imposes additional reliability challenges on such hardware. The success and effectiveness of an HCI application relies greatly on the use of reliable hardware, especially for applications that are safety-critical or that can influence human lives (such as [1]).

Reliability and power consumption are two key concerns in the development of today’s cyber-physical systems. Several low-power techniques are available to reduce consumption in today’s chips, such as Dynamic Voltage and Frequency Scaling (DVFS) techniques, which reduce performance and power consumption when they are not needed, or the use of subthreshold power-supply voltages [32], which boosts the energy savings. However, the use of low-energy modes of operation increases the probability of error occurrence due to several problems: (1) reducing the power-supply voltage imposes a simultaneous reduction of the clock frequency (the performance reduction), which changes the existing power and performance operating conditions (VDD/clock frequency pair) and may lead to a different and non-optimized circuit operation; (2) working at reduced power-supply voltages makes the circuit more vulnerable to operational-induced delay-faults and transient-faults, as well as to several environmental parameters (radiation, electromagnetic interference, aging, etc.). Moreover, the use of nanotechnologies with transistors of reduced dimensions increases the uncertainty of circuit behavior. With smaller gate dimensions, the vulnerability of gates to environmental conditions increases. Therefore, simply using wider static safety margins to account for all variability is not enough, and new methodologies to improve reliability in nanometer technologies should be defined; these must be compatible with ultra-low-power design schemes and aggressive low-power techniques.

The performance of CMOS circuits can be affected by many effects and parametric variations. The most important ones are Process, power-supply Voltage and Temperature (PVT) variations [15, 28, 29] and, since CMOS technology entered the nanometer era, also aging effects (PVT plus Aging, PVTA), with effects like Bias Temperature Instability (the negative, NBTI, and the positive, PBTI) [19] being a main concern for transistors. These variations, alone or piled up, degrade circuit performance and increase variability in circuits. Consequently, circuit operation and performance should be constantly monitored in today’s chips and hardware systems, to avoid errors and make circuits and their applications reliable. In this context, several performance sensors were proposed in the past [30, 33], but either they do not work at reduced subthreshold VDD voltages ([30]), or they are complex to implement and are not widely adopted by industry ([33]).

The purpose of this work is to propose a new easy-to-use performance sensor for reliable operation that can work at subthreshold voltages, allowing its use in ultra-low-power applications. This performance sensor is an improved version of the previously published aging sensors [19] and [34], with the changes needed to work at subthreshold voltage levels, constituting the low-power (LP) version. The new sensor has all the main features available in the global sensor version of [34], but the sensor functionality was redesigned so that VDD can be lowered to subthreshold voltage levels and the predictive detection of errors is enhanced, to account for the higher unpredictability at reduced voltage levels.

The remainder of the paper is organized as follows. The background work on performance sensors is presented in the next section (Sect. 2). Section 3 presents the LP Global Performance Sensor (LPGPS), its architecture and functionality. Section 4 describes how to use the LPGPS in a circuit and how to achieve a reliable operation. Section 5 presents the SPICE simulation results, and finally Sect. 6 summarizes the conclusions and the future work.

2 Background

In the past, various sensors were proposed and can be used to measure performance, despite the fact that not all were proposed as performance sensors. The purpose is to monitor changes in performance, time deviations in transitions occurred at key nodes, or even to monitor delays in the circuit, and to detect when these variations may lead to circuit errors. These sensors may appear in literature as aging sensors, delay-fault sensors, or as soft-error sensors. Nevertheless, they all can be labeled as performance sensors, as they can identify deviations that affect circuit’s performance [30].

There are two approaches regarding the use of these sensors in the circuit: local sensors or global sensors. Local sensors detect circuit’s degradation locally, where errors may occur, and they provide a detailed analysis on circuit’s performance behavior [9,10,11, 16, 17, 33]. However, they have two important limitations: (1) when working on-line, they can only monitor performance and detect its variation if they are activated (or when circuit operation activates the monitored paths); (2) their use and implementation in a circuit is complex and, because of that, so far they are not adopted by industry. Regarding global sensors, they use key parameters to detect circuit’s performance degradation, or use dummy critical paths’ replicas [20,21,22,23]. Despite the fact that global sensors do not monitor the actual circuit locations where errors may occur and, because of that, they may have different PVTA variations when compared to the circuit they are monitoring, their use is straight-forward and performance monitoring is, normally, independent from circuit operation. Consequently, they can easily be used in circuits and they are widely used in industry. Interestingly, both global and local sensors can also be used in improved solutions to monitor performance degradation and they work cooperatively to minimize error occurrence [13, 19]. However, despite the good results of this solution, its complexity requires too much effort to be widely adopted in industry.

Regarding error detection, there are also two possible approaches: detecting errors predictively, or detecting errors after their occurrence. Predictive error detection is done by identifying data transitions that arrive at key FFs or key memory elements when an error is imminent. This is done using a safety margin or guard-band, placed before the clock edge-trigger [9, 16, 17, 26]. Errors are anticipated, as sensors detect the imminence of an error without the actual error occurrence. However, if an abnormal delay exists that exceeds the guard-band margin, an irrecoverable error happens. On the other hand, detecting errors after their occurrence is done by detecting and/or correcting late transitions at key memory elements. For this, an additional delayed capture is needed [11, 18, 24, 25]. However, correcting errors after their occurrence usually needs complex recovery mechanisms that span different architectural levels. Another possibility is to use time-borrowing locally to avoid errors, by borrowing time from subsequent clock cycles, which in turn reduces the safety margins of those subsequent clock cycles. Interestingly, improved solutions like [27] implement both strategies, but the resulting solution is complex, yielding a FF with high overhead in performance and area.

Despite the variety of solutions in the literature to monitor performance or error occurrences related to performance deviations, only recently has a work appeared that focuses on sensors and techniques able to work at subthreshold power-supply voltage levels ([33]). The work in [33] presented a local performance sensor that can work predictively at subthreshold voltage levels. Despite this good solution, local sensors are difficult for industry to adopt, because they are intrusive to the circuit, i.e., the original circuit has to be changed in order to insert the sensors locally. Therefore, changes in performance may be limited but they are expected, and industry normally avoids changing its optimized designs and prototypes. As a consequence, global sensors are still a better solution for industry, because they are not intrusive and, according to the safety margins adopted, they can easily work predictively and avoid errors, while helping to optimize performance and/or power consumption in the presence of environmental and operational induced variations.

Regarding subthreshold operation, several previous works show that working at subthreshold voltages can drastically decrease power consumption, which makes it one of the most important research areas regarding energy optimization in digital circuits. Subthreshold operation is gaining importance especially for IoT, mobile, or battery-operated applications, because these applications normally do not require permanent and intensive performance, or the processing speed is not a critical factor during all of the operating time. Moreover, the range of applications for subthreshold operation is wide, from digital circuits [7, 8] to analog circuits [5, 6], mixed-signal applications [4], and even memory applications [1,2,3]. Therefore, as mobile and battery-operated applications become more popular and widely used, it is urgent to define new reliability techniques and sensors that work at subthreshold voltage levels.

The purpose of this work is to adapt existing global and local sensors for performance failure prediction to be compatible with subthreshold voltages in the VDD. The main purpose of the new proposed sensor is to avoid errors, monitor PVTA variations, and measure performance deviations. This measure can be used to control the application of DVFS techniques, used to reduce power consumption and/or clock frequency.

3 Low-Power Global Performance Sensor

The proposed Low-Power Global Performance Sensor (LPGPS) is an improved version of the previously published local aging sensors [19, 34], but with the necessary changes to work at subthreshold voltage levels and improving its design. The LPGPS is based on two dummy critical paths (CP), to produce several delays’ replicas. As will be explained later, these two dummy CP are designed in order to be highly sensitive to NBTI (one path) and to PBTI (the other path) aging effects, while the delay in both paths will change according to every environmental or operational parameter variation that can affect performance. Therefore, we can say that the LPGPS is sensitive to the main variations affecting performance, i.e., PVTA variations.

When we talk about performance, we are intrinsically talking about delays and the time available to perform a certain task. Therefore, when designing a performance sensor, the best way is to monitor delays and to monitor or measure how quickly a specific task is performed. In CMOS circuits, delays are measured by comparing an initial signal transition with the succeeding transition at the end of the stimulated data path. Hence, a change in any parameter that can affect performance in a circuit (e.g., PVTA variations) changes the delay between the initial and the ending transition in the stimulated path. Moreover, as different transitions can impose different delays, our decision in this work was to implement dummy paths (two in fact, as will be explained later) and to stimulate them with the two possible transitions, Low-to-High and High-to-Low. Then, by measuring how far the initial transitions propagate along these paths during a predetermined time imposed by the clock (the clock period), we have a measure of the performance of these dummy critical paths.
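
As a rough illustration of this measurement principle, the following Python sketch counts which taps along a chain a transition reaches within one clock period. The uniform per-gate delay, the tap positions, the numeric values and the function name are assumptions made only for illustration; they are not taken from the sensor design.

```python
def taps_reached(gate_delay_ps, tap_positions, clock_period_ps):
    """Return the tap positions (in gates) that the transition reaches in time.

    Assumes every gate in the dummy path has the same delay, which is a
    simplification; real NOR/NAND chains have transition-dependent delays.
    """
    return [n for n in tap_positions if n * gate_delay_ps <= clock_period_ps]


# Hypothetical numbers: taps after 10, 12, 14 and 16 gates, 400 ps clock period.
taps = [10, 12, 14, 16]
fresh = taps_reached(25.0, taps, 400.0)   # 25 ps per gate
slowed = taps_reached(30.0, taps, 400.0)  # gates slowed by PVTA variations
print(fresh, slowed)
```

With slower gates the transition reaches fewer taps in the same clock period, which is the effect the sensor exploits to rate performance.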

3.1 Sensor’s Architecture and Functionality

Figure 1 presents the main LPGPS architecture. It is composed of a controller block, two dummy critical paths, and two groups of sensor Latches, to measure the propagation delay in both dummy paths, in order to compare these delays with the available clock period. With the knowledge of the critical paths of the circuit under test (CUT) where this sensor will be installed, it is possible to create two fictitious paths (dummy paths) with propagation delays higher than the expected CUT’s CP during its lifetime.

Fig. 1. Low-Power Global Sensor architecture.

Figure 2 shows the dummy critical paths in more detail. As it can be seen, the two paths are composed of chains with NAND and NOR gates. One chain is implemented with NOR gates (dummy critical path 1 in Fig. 2), creating a fictitious critical path, which will, presumably, age more than the critical paths of the circuit when subjected to NBTI effect (which strongly influences the degradation of PMOS transistors’ Vth). The other chain is implemented with NAND gates (dummy critical path 2 in Fig. 2) creating another fictitious critical path which will, presumably, age more than the critical paths of the circuit when subjected to the PBTI effect (which strongly influences the degradation of Vth in NMOS transistors).

Fig. 2. Dummy critical paths of the LPGPS.

It is important to notice that, in a global sensor that is separate from the CUT and monitors delays of paths that are not the actual CP in the CUT, we must consider a safety margin to account for differences between the CUT’s CP delay and the delays of the LPGPS dummy paths. Moreover, when we want to address aging variations, we must consider the possibility that the CUT’s CP ages more than the sensor’s CP, which may exceed the safety margin used, and then errors may happen. Therefore, it is important to impose an extreme aging degradation on the dummy CP in the LPGPS, so that the existing safety margin always works in favor of error avoidance and not the opposite. That is also the reason for using two dummy CP, one sensitive to NBTI aging and the other sensitive to PBTI aging.

The input port-map of the NOR and NAND gates is also very important for the high aging degradation of the PMOS and NMOS transistors in the respective chain paths. The internal structures of the NOR and NAND gates are presented in Fig. 3. For the NOR gate (Fig. 3(a)), the probability for transistor P1 to be in stress mode is equal to the probability of having its NOR_chain input at logic value 0. If the global sensor is activated periodically, this signal will most likely be at low logic value most of the time, giving a high degradation probability for transistor P1. However, the probability of putting P2 in stress mode is equivalent to the probability of having both P1 and P2 transistors on, i.e., \( {\text{P}}\left( { \left[ {{\text{NOR}}\_{\text{chain}}} \right]_{input}\,= 0 } \right) \times {\text{P}}\left( { \left[ {{\text{Age}}\_{\text{enable}}\_1} \right]_{input}\,= 0 } \right) \). Considering that the global sensor is activated periodically, the Age_enable_1 signal has a low probability of being at logic value 0. Hence, P2 will have negligible degradation. Yet, a high degradation probability of the CP delay replica is guaranteed by the high degradation probability of all the P1 transistors of the NOR chain, due to the NBTI effect. Moreover, if a higher degradation is needed in the dummy critical path, a 3-input NOR gate can also be used, with two of its inputs connected to the same NOR_chain input, so that two PMOS transistors are in a high aging state. However, if subthreshold voltages are to be used, 3-input gates that use a classic CMOS implementation should be avoided, as they may restrict VDD reduction.

Fig. 3. Internal structure and port map for: (a) NOR gates, and (b) NAND gates.

Regarding the NAND gates, a similar analysis can be drawn to create a high aging probability in dummy critical path 2. For the NAND gate (Fig. 3(b)), the probability for transistor N1 to be in stress mode is equal to the probability of having its NAND_chain input at logic value 1. If the global sensor is activated periodically, this signal will most likely be at high logic value most of the time, giving a high degradation probability for transistor N1. However, the probability of putting N2 in stress mode is equivalent to the probability of having both N1 and N2 transistors on, i.e., \( {\text{P}}\left( { \left[ {{\text{NAND}}\_{\text{chain}}} \right]_{input}\, = 1 } \right) \times {\text{P}}\left( { \left[ {\overline{{{\text{Age}}\_{\text{enable}}\_2}} } \right]_{input}\, = 1 } \right) \). Considering that the global sensor is activated periodically, the \( \overline{{{\text{Age}}\_{\text{enable}}\_2}} \) signal has a low probability of being at logic value 1. Hence, N2 will have negligible degradation. Yet, a high degradation probability of the CP delay replica is guaranteed by the high degradation probability of all the N1 transistors of the NAND chain, due to the PBTI effect.
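
The stress probabilities discussed for the NOR and NAND gates can be checked numerically with a short Python sketch. The signal probabilities used below (a chain input at its stressing level 99% of the time and an enable input on 1% of the time) are assumed values for illustration only, and the two inputs are treated as statistically independent, as the product in the text implies.

```python
def stress_probabilities(p_chain_active, p_enable_active):
    """Stress probabilities for the two stacked transistors of one gate.

    p_chain_active:  probability that the chain input holds the stressing
                     value (0 for the NOR/PMOS case, 1 for the NAND/NMOS case).
    p_enable_active: probability that the enable input also turns on the
                     second transistor.
    Inputs are assumed independent.
    """
    first = p_chain_active                      # P1 (NOR) or N1 (NAND)
    second = p_chain_active * p_enable_active   # P2 (NOR) or N2 (NAND)
    return first, second


# Assumed values: chain input at its stressing level 99% of the time,
# enable input on only 1% of the time (sensor rarely in test mode).
p_first, p_second = stress_probabilities(0.99, 0.01)
print(f"first transistor: {p_first:.4f}, second transistor: {p_second:.4f}")
```

Under these assumed values the first transistor is stressed almost permanently, while the second is stressed about 1% of the time, in line with the negligible degradation argued for P2 and N2.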

The LPGPS operation is as follows. When the local control unit in the LPGPS (the Global Sensor Controller block) receives a signal to start the analysis of the circuit’s performance, it generates the control signals (\( {\text{Test}}\_{\text{data}}\_1 \), \( {\text{Age}}\_{\text{Enable}}\_1/\overline{{{\text{Test}}\_{\text{Enable}}\_1}} \), \( {\text{Test}}\_{\text{data}}\_2 \), \( \overline{{{\text{Age}}\_{\text{Enable}}\_2}} /{\text{Test}}\_{\text{Enable}}\_2 \)) that operate the overall performance analysis. Signals \( {\text{Age}}\_{\text{Enable}}\_1 \) and \( \overline{{{\text{Age}}\_{\text{Enable}}\_2}} \) make the NOR and NAND gate chains transparent to their chain inputs, so that a test sequence applied through signals \( {\text{Test}}\_{\text{data}}\_1 \) and \( {\text{Test}}\_{\text{data}}\_2 \) stimulates the two state transitions at the outputs of the gates of the two chains. Along the two chains, special sensor cells, built with a Latch, a Delay Element and an Activity Sensor with on-retention logic, are connected at the outputs of several NAND and NOR gates, to create several fictitious paths with different propagation times. The architecture of these sensor cells is described in the next section.

3.2 Low-Power Sensor Latch

As observed in Fig. 1, the critical path outputs are connected to sensor latches. Each sensor latch is composed of a common D-Latch, to implement the Latch functionality, and an Activity Sensor, to implement the sensor functionality, as denoted in Fig. 4. The Latch functionality is important because the delays of the CP connected to the latches must be compared with the existing clock period, and the Latch is a clocked gate that can easily implement different behaviors along the clock period. The implemented latch is enabled when the clock signal is low, i.e., it is in transparent mode for the low state of the clock and in non-transparent (opaque) mode for the high state of the clock. The main function of the activity sensor is to sense and signalize critical transitions in the Latch, i.e., transitions at the output of the dummy CP that arrive at the Latch input at the end of the transparent mode. When a critical transition in the Latch is signalized, it means that the CP delay is close to the clock period, within a pre-defined safety margin. As several dummy CP outputs are connected to sensor latches, the set of sensor outputs provides a measure of how the dummy CP delays compare with the clock period, which gives a rating for the performance level of the circuit. It is important to impose a safety margin in this monitoring process, as the dummy CP in the LPGPS is not the real CP of the CUT. This safety margin can be identified as the margin to signalize critical transitions in the Latch data before the latch becomes opaque. Moreover, the activity sensor implements an on-retention logic, to keep the sensor output (SO) active once it is activated and until the sensor reset (\( \overline{SR} \)) is activated.

Fig. 4. Low Power Sensor Latch architecture.

The architecture of the new Activity Sensor is presented in Fig. 5. The sensor functionality is based on delaying the data signal inside the activity sensor. The difference in time between the delayed and the non-delayed data signals creates a time window, which provides a safety margin. If a transition in the delayed data signal occurs after the positive clock trigger, the sensor detects a critical transition for the existing clock period and signalizes it in the sensor output (SO). The activity sensor used here was previously published in [33] and is specially designed to allow VDD reductions to subthreshold levels.

Fig. 5. Activity Sensor architecture.

Figure 6 presents a timing diagram for the main signals of the activity sensor described in Fig. 5. From Fig. 1, each delay path connected to a sensor latch propagates an initial transition from the positive edge-trigger of the clock until it reaches the Latch input, and this timing delay is symbolized in the D signal in Fig. 6, which becomes stable after τDE. If the transition arrives at the D input while the clock is low, the latch is in transparent mode (Fig. 4) and the Q signal changes accordingly (Fig. 6). As denoted in Figs. 5 and 6, the Delay Element (DE) is used to postpone the arrival of data signals coming from the output of the latch (signal Q) during the Clk low state, while an XOR gate is used to generate a pulse (det signal) for every activity in the latch data (signal Q).

Fig. 6. Timings in the Sensor Latch.

Note that both inputs of the XOR gate are fed from signal Q, although one of them is a delayed version of Q. This means that the output of the XOR gate, the det signal, will have a pulse whenever a transition (Low-to-High or High-to-Low) occurs at the inputs, with a pulse duration proportional to the propagation delay of the DE block. It is important to note that the XOR gate should be implemented using pass-transistor logic, to allow VDD to be reduced to subthreshold voltage levels. Figure 7 presents a possible architecture for the XOR gate.
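
The pulse generation described above can be illustrated with a small discrete-time sketch in Python. The sampled waveform, the delay expressed in samples, and the function name are assumptions for illustration; gate delays other than that of the Delay Element are ignored.

```python
def det_pulse(q, delay_samples):
    """XOR a sampled waveform with a copy delayed by `delay_samples` samples.

    The delayed copy is padded with the initial value of `q`, i.e. the signal
    is assumed to have been stable before the first sample.
    """
    delayed = [q[0]] * delay_samples + q[:len(q) - delay_samples]
    return [a ^ b for a, b in zip(q, delayed)]


# Q rises at sample 4; the Delay Element is modelled as 3 samples.
q = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
det = det_pulse(q, 3)
print(det)
```

In this model the det pulse starts at the transition of Q and lasts exactly as many samples as the modelled delay, for rising and falling transitions alike.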

Fig. 7. Pass-transistor logic XOR gate’s architecture.

The remaining functionality of the activity sensor from Figs. 5 and 6 is, basically, the AND functionality implemented with transistors Q1–Q4 and the inverter gate. When a clock pulse (considering that the latch is opaque when the clock is active) and a det pulse occur simultaneously, the sensor output signal (SO) is activated. This way, all the late transitions at the Latch input are signalized by the Activity Sensor. Therefore, a pre-defined margin, obtained from the propagation delays of the DE blocks and the det signal pulse duration, defines the instant before the end of the clock period from which the activity sensor output is activated. If the CP propagation delays increase due to adverse PVTA variations, the margin also increases, which is a good result for a sensor, as the safety margin of the global sensor and the detection margin of this activity sensor both increase when operating conditions worsen.
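
A minimal behavioural sketch of this detection condition is given below in Python. It is a simplified model under stated assumptions: the transition is taken to reach the latch during its transparent phase, the delays of the XOR and AND stages are ignored, and a transition arriving after the clock edge is simply reported as not detected. The function name and the numeric values are hypothetical.

```python
def sensor_output(arrival_ps, period_ps, tau_de_ps):
    """Behavioural model of one sensor latch (XOR/AND gate delays ignored).

    arrival_ps: time after the launching clock edge at which the transition
                reaches the latch input.
    Returns True when the det pulse [arrival, arrival + tau_de) overlaps the
    next active clock phase, which starts at period_ps.
    """
    captured = arrival_ps <= period_ps          # latch still transparent
    pulse_end = arrival_ps + tau_de_ps
    return captured and pulse_end > period_ps


# Assumed numbers: 1000 ps clock period, 100 ps Delay Element.
print(sensor_output(850, 1000, 100))   # early transition, no detection
print(sensor_output(950, 1000, 100))   # transition inside the margin
```

In this model the detection margin is the Delay Element delay itself, so a slower Delay Element widens the window in which a transition is flagged.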

Moreover, this activity sensor has also an on-retention logic feature. This is executed by transistors Q5 and Q6 (Fig. 5), which avoids the use of an additional latch to hold an active sensor output signal (SO). Besides, it has also an active-low reset signal, the \( \overline{SR} \) input, to force SO signal to reset (no sensor predictive error detection), which is implemented by transistor Q7.

Regarding the Delay Element, as mentioned before, its propagation delay defines a detection margin in the sensor latch. As working at subthreshold voltages requires avoiding complex gate architectures, the architecture of the DE is basically two inverters (a simple buffer), as presented in Fig. 8. Nevertheless, if several delay options are needed for the DE block, multiple buffers can be used to provide a higher propagation delay. Moreover, the DE can be optimized by design, by changing the W/L ratios of the transistors, which also changes the propagation delay of the cell. The use of higher detection margins avoids the use of small delay steps between consecutive sensor latches in the NOR/NAND paths. Therefore, the granularity of the performance measurement obtained by the Global Sensor, for a specific maximum critical path, depends on the number of sensor latches used and their detection margins. Note that the time margin is not programmable, as it is in other aging sensor solutions [17]. Instead, the margins are defined at design time. Nevertheless, the propagation delay of the DE adapts to PVTA variations, which changes, as needed, the sensitivity of the Activity Sensor to activate its output. Moreover, the design and selection of the Sensor Latches and DEs should be tuned during silicon validation, according to the clock frequency and VDD chosen.

Fig. 8. Delay element typical architecture for subthreshold operation.

4 Global Sensor’s Usage Procedure

The main purpose of the proposed LPGPS is to monitor and measure performance, to avoid errors and guarantee a reliable operation, even in the presence of PVTA variations. Therefore, during the design stage, the circuit’s CP should be identified, preferably using an aging-aware static timing analysis tool, to define the worst-case circuit CP. With this information, the LPGPS can be implemented (or selected from a list of different pre-defined LPGPS implementations), namely the dummy critical paths created with the NAND and NOR gate chains. The designer must choose how many NAND/NOR gates to use, according to the circuit’s CP, how many Sensor Latches to use, and how sparsely or densely they are placed at the dummy CP outputs (depending on the sensitivity needed). Figure 9 summarizes this procedure, and we can see that the selected outputs of the dummy critical paths must guarantee that the circuit’s CP delay lies between the maximum and minimum dummy CP delays. Moreover, the Sensor Latch detection margin guarantees that a progressive measurement of performance can be created by signalizing progressively more Sensor Latch outputs (for a given clock frequency and power-supply voltage). When the detection margin of a Sensor Latch overlaps the active phase of the clock, the Activity Sensor signalizes a detection. Figure 9 illustrates a typical situation, where the circuit’s CP delay is placed in the middle of the Global Sensor’s dummy path delays, and the more sensitive sensor latches (connected to the high-delay dummy paths) are signalizing error detections (S1–S4), while the less sensitive sensor latches (connected to the low-delay dummy paths) do not signalize error detections (S5–S7).
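
One way to turn the Sensor Latch outputs into a single performance reading is sketched below in Python. The encoding as a count of active latches, the check for a monotonic (thermometer-like) pattern, and the function name are assumptions of this sketch and are not specified in the text.

```python
def performance_level(latch_outputs):
    """Count active sensor latches, ordered from most to least sensitive.

    Raises ValueError if the pattern is not a thermometer code (an active
    latch following an inactive one), which this sketch treats as invalid.
    """
    level = sum(latch_outputs)
    expected = [1] * level + [0] * (len(latch_outputs) - level)
    if list(latch_outputs) != expected:
        raise ValueError("non-monotonic sensor pattern")
    return level


# S1-S4 active, S5-S7 inactive, as in the typical situation described above.
level = performance_level([1, 1, 1, 1, 0, 0, 0])
print(level)
```

With seven latches this encoding yields levels from 0 (no latch active) to 7 (all latches active).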

Fig. 9. Delay element typical architecture for subthreshold operation.

After the circuit is fabricated, and during manufacturing off-line tests, an initial calibration procedure is mandatory (it can be repeated sparsely during lifetime). This procedure uses off-line tests (e.g., scan-based delay-fault oriented tests), performed repeatedly for different values of the clock frequency and voltage, to determine the maximum and minimum working clock frequencies and voltages. In addition, it also determines the output of the Global Sensor (number of sensor latches with error detections) for a correct operation of the circuit under test. This information determines the LPGPS output for a safe/optimized circuit operation and should be used to avoid errors, and to increase the clock frequency, if performance optimization is the goal, or to decrease VDD, if power consumption reduction is the goal. Note that reducing the clock frequency (or increasing VDD) causes fewer sensor latches to signalize, whereas increasing the clock frequency (or reducing VDD) causes more sensor latches to signalize.

It is also important to note that, if LPGPS is activated sparsely in time, the aging degradations in the dummy-paths will be higher than in circuit’s CP, which guarantees a reliable operation during circuit’s lifetime. As LPGPS was developed to work at subthreshold voltages, this reliable operation is guaranteed even for ultra-low-power modes, with reduced VDD and clock frequency. Moreover, being a non-intrusive sensor, it can be easily adopted by industry.

5 Results

SPICE simulation results are presented for a 65 nm technology using Predictive Technology Models [14]. Typical nominal conditions (NC) are VDD = 1.1 V with T = 27 °C. VDD is lowered down to 0.3 V, which is still a feasible working power-supply voltage (the minimum working VDD is 0.25 V).

The first simulations were made to show that the pulse generated in the Activity Sensor increases its length when PVTA variations worsen. Figure 10 shows the det signal (a pulse detecting an unsafe transition at the Latch data input) from the schematics presented in Figs. 4 and 5, for VDD variations and T (temperature) variations. As can be seen, the pulse duration increases with VDD reduction and with T increase. Moreover, as shown in previous publications [9, 10, 19, 30], similar results are obtained for aging and process variations. This makes the sensor more sensitive when worse environmental conditions exist, which lightens the burden of sensor design [33].

Fig. 10. Det signal, with several VDD and T values for 65 nm technology: (a) VDD from 1.1 V down to 0.8 V; (b) T from 27 °C up to 147 °C.

Next simulation results present the Global Sensor’s outputs, when a monitoring procedure takes place. Figure 11 shows at the bottom the clock signal (2.4 GHz), in the middle the stimulus (rise and fall transitions) of the dummy paths (in this case, the NOR gates’ stimulus), and one Sensor Latch output (in this case, S4) in the upper graph. It can be seen that both transitions are activated (middle graph), which in this case triggers Sensor Latch number 4 to activate its output (in fact, S1–S4 sensors will be activated).

Fig. 11. Global Sensor signals (clock, NOR gates’ stimulus and Sensor Latch number 4 output).

In this LPGPS, seven Sensor Latches were used in each dummy path, creating eight performance levels for the monitoring process. Figure 12 shows the 7 worst case dummy CPs, considering both NOR and NAND gates’ chains (for VDD values higher than 0.6 V).

Fig. 12. Global Sensor dummy paths.

Several simulations were made to determine the clock frequency (clock period) for VDD values from 1.1 V to 0.3 V, so that the highly sensitive Sensor Latches (S1–S4) are activated. Figure 13 summarizes all the simulations and presents Clock Period vs. Power-supply voltage (note that, for easier understanding, the clock period axis is shown in logarithmic scale).

Fig. 13. Clock Period vs. Power-supply voltage for LPGPS operation signalizing sensors S1–S4.

Considering that the optimum operation (with a safety margin) is obtained when S1–S4 are signalized (and S5–S7 are not activated), an optimized clock frequency can be obtained for each VDD value. Therefore, the LPGPS output can be used to dynamically change voltage and frequency with an on-line DVFS methodology. The circuit operation is optimized and a reliable operation is guaranteed even at subthreshold voltages, because safety margins are increased when PVTA variations worsen and the LPGPS increases its sensitivity for worse operating conditions.
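
A possible shape for one step of such an on-line adjustment is sketched below in Python, for a fixed VDD. The paper leaves the on-line DVFS scheme to future work, so the control rule, the fixed frequency step, the target of four active latches, and the function name are all assumptions made here for illustration.

```python
def dvfs_step(active_latches, target, freq_mhz, step_mhz):
    """One frequency-adjustment step at a fixed VDD.

    More active latches than the target means the margin is shrinking, so the
    clock is slowed; fewer means there is slack, so the clock is sped up.
    """
    if active_latches > target:
        return freq_mhz - step_mhz
    if active_latches < target:
        return freq_mhz + step_mhz
    return freq_mhz


# Hypothetical readings against a target of four active latches (S1-S4).
print(dvfs_step(6, 4, 100, 5))  # too many latches active: slow down
print(dvfs_step(2, 4, 100, 5))  # too few latches active: speed up
```

The same comparison could instead drive VDD in the opposite direction (raise VDD when too many latches are active), following the dependence on frequency and voltage described in Sect. 4.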

6 Conclusions

In this paper, a Global Performance Sensor for reliable operation was presented, designed especially to work at both nominal and subthreshold power-supply voltage levels. SPICE simulations for a 65 nm CMOS technology show that VDD reductions increase the LPGPS sensitivity, which makes the sensor more cautious when reduced voltages are used (or under worse PVTA degradations). This feature makes this solution unique among non-intrusive global sensors for ultra-low-power operation modes (working at subthreshold voltage levels).

Future work includes the use of the proposed LPGPS in an on-line adaptive DVFS scheme, to work at subthreshold voltage levels. Moreover, real circuit tests are also important, to validate the power and delay trade-off with real data obtained by measurements in real circuits. For this matter, a test chip was recently produced and new results are expected in the near future.