# A Deep-Subthreshold Variation-Aware 0.2-V Open-Loop VCO-Based ADC

Viet Nguyen, *Student Member, IEEE*, Filippo Schembari, *Member, IEEE*, and Robert Bogdan Staszewski, *Fellow, IEEE*

Abstract: This article demonstrates the potential of deep-subthreshold mixed-signal circuits in delivering medium-to-high performance to supply-constrained, energy-harvesting Internet of Things (IoT) sensing applications. This effort encapsulates the design and implementation of an ultra-low-voltage (ULV) 0.2-V open-loop VCO-based analog-to-digital converter (ADC). A replica VCO facilitates variation-aware VCO analog linearization. Analog phase-domain signal processing (APSP) techniques for beat-frequency extraction, phase-interpolation, and phase-folding relax constraints on both voltage-to-frequency analog circuitry and frequency-to-digital synchronous digital hardware. High-speed multi-phase frequency-to-digital converters (FDCs) and a multi-rate digital back-end enable a sampling speed of 35 MS/s. The ADC prototype is implemented in 28-nm CMOS and achieves a peak SNDR of 64.4/59.9 dB, equivalent to an ENOB of 10.4/9.7 over 80-/160-kHz bandwidth (BW). The ADC core occupies an active area of 0.12 mm<sup>2</sup> and consumes 15.9 $\mu$W, resulting in a Walden and Schreier FoM of, respectively, 73.3/61.5 fJ/c-s and 161.4/159.9 dB at the corresponding BW configurations. Measurements across multiple ICs and supply voltages consolidate the value of variation-aware deep-subthreshold open-loop ADCs.

Index Terms: 0.2 V, analog-to-digital converter (ADC), beat-frequency (BF), deep-subthreshold, energy-harvesting, folding and interpolation, Internet of Things (IoT), phase-domain, ultra-low voltage (ULV), variation-aware, voltage-controlled oscillator (VCO)-based ADC.

## I. INTRODUCTION

ENERGY harvesting (EH) holds immense promise of delivering full energy autonomy and system portability to networks of densely interconnected Internet of Things (IoT) devices. Ambient energy sources, such as photovoltaic, thermo-electric, and human vibration harvesters, can produce an output voltage well below 300 mV with power density

Viet Nguyen and Robert Bogdan Staszewski are with the School of Electrical and Electronic Engineering, University College Dublin, Dublin 4, D04 V1W8 Ireland (e-mail: viet.nguyen@ucdconnect.ie).

Filippo Schembari was with the School of Electrical and Electronic Engineering, University College Dublin, Dublin 4, D04 V1W8 Ireland. He is now with Huawei Technologies, 20090 Segrate, Italy.

Color versions of one or more figures in this article are available at https://doi.org/10.1109/JSSC.2021.3114006

Digital Object Identifier 10.1109/JSSC.2021.3114006

[Fig. 1: five-panel plot (a)-(e); axes include $V_{DD}$ (V), energy/cycle (fJ), temperature (°C), and $V_{DD}$ variation (%).]

Fig. 1. Simulated (a) power consumption, (b) energy/cycle, (c) oscillation frequency  $(f_{osc})$  of an RO, and (d) statistical variation in the propagation delay of an inverter from subthreshold ( $V_{DD} = 0.2$  V) up to super-threshold operation ( $V_{DD} = 0.8$  V). (e) Impact of VT variations on  $f_{osc}$  relative to the nominal frequency  $f_{\rm osc,nom}$  in sub- and super-threshold regions.

less than 100 $\mu$W/cm<sup>2</sup> [1]. Intermediate dc–dc converters can boost the supply voltage, but intrinsic parasitic losses with large voltage up-conversion ratios waste the scavenged energy [2]-[4]. The need for innovation at the circuit and architecture levels is, therefore, paramount, as evidenced by research progress in ultra-low-power (ULP) and ultra-low-voltage (ULV) sub-systems for communications [5]-[7], frequency synthesis [8]-[10], and digital processing [11], [12]. In the domain of analog-to-digital conversion, we recognize that the analog-to-digital converter (ADC) should: 1) operate from well below 0.3-V supply using scaled CMOS technology; 2) provide versatility to different applications, thus covering a resolution and bandwidth (BW) range of approximately 9–11 bits and 50–200 kHz, respectively; 3) achieve $\mu$W-level power dissipation; and 4) exhibit some degree of tolerance to process, voltage, and temperature (PVT) variations.

A voltage-controlled oscillator (VCO)-based ADC can be considered a natural candidate for ULV operation [13], [14] because of its mostly digital implementation (i.e., simple inverters, flip-flops, and logic gates). To reinforce the intuition of supply scalability, the characteristics of a ring-oscillator

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/



Manuscript received May 2, 2021; revised July 18, 2021; accepted September 10, 2021. Date of publication October 12, 2021; date of current version May 26, 2022. This article was approved by Associate Editor Shanthi Pavan. This work was supported in part by the Science Foundation Ireland under Grant 14/RP/I2921, in part by the European Union's Horizon 2020 Programme through Marie Sklodowska-Curie under Grant 747585, and in part by the Microelectronic Circuits Centre Ireland (MCCI) through Enterprise Ireland under Grant TC-2015-0019. (Corresponding author: Viet Nguyen.)



Fig. 2. Open-loop VCO-based ADC with an eight-stage ring oscillator  $(N_{del} = 8)$  and with (a) single XOR-based FDC  $(N_{FDC} = 1)$  used in [16] and (b) eight parallel XOR-based FDC slices  $(N_{FDC} = 8)$  followed by the output summation logic, as a starting point for this work.

(RO) are shown in Fig. 1. Low-power [see Fig. 1(a)] and low-energy [see Fig. 1(b)] consumption requirements call for the minimization of the power-delay product (PDP) [15]. This generally occurs at the lowest supply voltage feasible  $(V_{\text{DD,min}})$  and fortunately coincides with the available voltage generated from the aforementioned energy harvesters. However, the circuit designer must address a number of unique deep-subthreshold challenges detrimental to the achievable BW and resolution of the VCO-based ADC. The exponential dependence of MOS subthreshold drain current on  $V_{DD}$  and threshold voltage  $(V_t)$  results in a 20× reduction in RO oscillation frequency as  $V_{DD}$  scales down to 0.2 V from 0.4 V [see Fig. 1(c)]. Intra-die mismatch arising from process (P) random dopant fluctuation (RDF)-induced  $V_t$  variations leads to a  $5 \times$  increase in gate propagation delay uncertainty at ULV [see Fig. 1(d)]. Voltage and temperature (VT) fluctuations are dynamic by nature and alter  $f_{osc}$  by more than  $10 \times (f_{\rm osc,max}/f_{\rm osc,min})$ , in stark contrast with merely  $0.5 \times$ (or  $\pm 20\%$ ) change in super-threshold operation [see Fig. 1(e)].

Fig. 2(a) depicts the simple open-loop VCO-based ADC in our prior work [16]. The input voltage $V_{in}$ passes through a resistive-drive "tune" circuit [17] with a voltage attenuation factor $K_{tune}$. This limits the peak voltage swing of $V_{tune}$ that directly modulates the oscillation frequency $f_{VCO}$ of the $N_{del}$-stage RO circuit. The $V_{tune}$-to-$f_{VCO}$ gain of the RO is $K_{ring}$, while the overall gain of the VCO ($K_{VCO}$) is the product $K_{tune}K_{ring}$. A 1-bit XOR-based frequency-to-digital converter (FDC) [18] operating at the sampling speed of $f_{CLK}$ is employed to digitize the VCO frequency information. For this topology, $f_{VCO,max} < f_{CLK}/2$ prevents the overflow of the FDC (i.e., exceeding the available quantization range). Parallel FDC slices ($N_{FDC} > 1$) can be introduced [see Fig. 2(b)] to further improve the quantization noise performance, at the cost of increased complexity in the synchronous digital hardware.
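To make the overflow condition concrete, the following is a minimal behavioral sketch of a 1-bit XOR-based FDC, not the circuit of [16] or [18]. An ideal square wave is sampled at $f_{CLK}$ and consecutive samples are XOR-ed, so the density of ones approximates $2f_{VCO}/f_{CLK}$. The function name `xor_fdc_density`, the jitter-free square wave, and the two test frequencies are assumptions chosen for illustration.

```python
def xor_fdc_density(f_vco_hz: int, f_clk_hz: int, n_samples: int) -> float:
    """Average output of an idealized 1-bit XOR-based FDC (behavioral model).

    The VCO is modeled as an ideal square wave that toggles every half period.
    Integer arithmetic avoids floating-point rounding at the sampling instants.
    """
    ones = 0
    prev_bit = 0  # sample at n = 0: no half-cycles have elapsed yet
    for n in range(1, n_samples + 1):
        # Number of output toggles that have occurred by t = n / f_clk
        half_cycles = (2 * f_vco_hz * n) // f_clk_hz
        bit = half_cycles % 2
        ones += bit ^ prev_bit  # 1 only when an odd number of toggles occurred
        prev_bit = bit
    return ones / n_samples


F_CLK = 35_000_000  # sampling clock quoted in the text, in Hz
in_range = xor_fdc_density(6_000_000, F_CLK, 35_000)   # f_VCO < f_CLK / 2
overflow = xor_fdc_density(29_000_000, F_CLK, 35_000)  # f_VCO > f_CLK / 2
print(in_range, 2 * 6_000_000 / F_CLK)
print(overflow)
```

In this model the 6-MHz input yields a density of 12/35, matching $2f_{VCO}/f_{CLK}$, while the 29-MHz input folds back onto the same value, which is the ambiguity that the bound $f_{VCO,max} < f_{CLK}/2$ avoids.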

Operating at 0.2 V, the predecessor of our ADC [16] achieved 11-ENOB over 60-kHz BW while consuming merely 7  $\mu$ W. The "raw" figure-of-merit (FoM) of <30 fJ/c-s was obtained by performing a meticulous circuit-level optimization of the resistive-drive VCO structure [the input resistor network is sized so as to cancel third-order harmonic distortion (HD3)] and maximizing the speed of the phase-sampling sense-amplifier flip-flop (SAFF) within the 1-bit XOR-based FDC.

Despite the encouraging FoM, critical limitations exist. The over-reliance on harmonic distortion cancellation to linearize the high-gain VCO meant that 6- and 18-dB losses in SNDR are incurred within, respectively, 2% and 8% variations of $V_{DD}$. At higher supply voltages (or temperature), the oscillation frequency of the VCO becomes so fast that it overflows the FDC.

A radically different approach is taken in this work. Instead of applying *circuit optimizations*, we explore whether *architectural improvements* can be introduced to obtain high resolution, BW, and possibly guarantee a "PVT-adaptive" functionality without incurring severely limited power efficiency. To this end, we present a versatile 0.2-V open-loop VCO-based ADC featuring a beat-frequency (BF) VCO network, resulting in a variation-aware, highly linear, and low-noise voltage-to-frequency (V-to-f) conversion. Analog phase-domain signal processing (APSP) techniques provide a flexible interface between the analog (i.e., VCO) and digital (i.e., FDC) domains to alleviate their respective design constraints. Finally, multi-rate synchronous digital circuitry enables high-BW frequency digitization.

This article is organized as follows. Section II reviews the compatibility of the main classes of VCO-based ADCs with the ULV operation and discusses the design challenges of a subthreshold open-loop VCO-based ADC, which motivates the proposed architecture. Section III investigates the VCO network, while Sections IV and V are dedicated to the description of the APSP chain and its constituent circuits. The experimental results are presented in Section VI.

## II. ARCHITECTURAL CHALLENGES AND SOLUTIONS

### A. Compatibility of Prior Art With ULV Operation

To address the general limitations of VCO-based ADCs, several architectural innovations have proven effective in mitigating the nonlinear V-to-f tuning curve of the VCO and the need for high-speed digital counting. Nonetheless, most such solutions are not portable to ULV supply (<0.3 V). In the popular closed-loop configuration, the VCO and digital circuitry act as a multi-bit quantizer (frequency-based) [18], [19] or perform voltage-to-phase integration and quantization (phase-based) [20] within the delta-sigma ($\Delta \Sigma$) modulator loop. The large loop gain suppresses the V-to-f nonlinearity and allows increasing the order of the loop filter for more aggressive in-band noise shaping. However, stability demands a rather narrow loop BW, and moreover, such a topology is hardly scalable with supply voltage since it heavily relies on analog-intensive circuitry, such as current sources and opamps.

In the multistage configuration [21], [22], the VCO-based ADC is typically used as the fine-stage quantizer to enhance the overall ADC resolution. V-to-f nonlinearity is mitigated by virtue of processing only a small residue voltage. Nevertheless, such a solution entails all the difficulties of the feedback DAC design, while the gain mismatch and timing delay between the coarse and fine paths seriously impair the ADC performance at ULV. An interesting alternative is to operate the VCO in the inherently linear two-point frequency modulation scheme. This enables cascading switched-current VCO-based integrators [23]–[26] within digitally intensive high-order noise-shaping  $\Delta\Sigma$  architectures, at the cost of an increased system complexity.

Background calibration of the V-to-f nonlinearity using multi-level dither correlation [27] can accommodate slow PVT variations although the digital signal processing circuitry may not be functional in deep-subthreshold and, furthermore, would otherwise greatly dominate the power budget. Foreground calibration [28]-[31] using post-silicon integrated built-in self-test only needs to operate intermittently (thus minimizing the averaged power), but a lack of real-time PVT tracking necessitates regular and disruptive off-line re-calibrations. Moreover, the accuracy of the calibration is dependent on the precision and stability of the test signal generated on-chip. This may be prohibitively expensive to integrate into a microwatt-level ULV design, limiting the calibration procedures to a well-controlled laboratory environment. The associated digital nonlinearity correction (NLC) also introduces quantization errors, while high-speed access of large memory arrays increases the power consumption and area [31].

The discussions above, while diverse, seem to reach the same unavoidable verdict: the supply-constrained environment of energy-harvesting IoT applications, together with the severe impact of PVT variations, demands that the designer pursue the lowest level of system complexity. The simplicity provided by the open-loop VCO-based ADC shown in Fig. 2 clearly offers a promising path to overcome the practical limitations of deep-subthreshold operation. Motivated by the pioneering works in [13] and [14] and building around the performance of [16], we continue to explore the deep-subthreshold open-loop ADC design landscape that, to date, remains uncharted territory.

### B. Deep-Subthreshold Design Challenges

To appreciate the impact of the deep-subthreshold operation on the achievable performance of the architecture shown in Fig. 2, we guide the reader through its system optimization.

The signal-to-quantization noise ratio  $(SNR_Q)$  derives from the contribution of the first-order noise-shaped quantization error

$$\text{SNR}_{Q} = 6.02 \cdot M_{Q} + 30 \cdot \log_{10} \left( \frac{f_{\text{CLK}}}{2\text{BW}} \right) - 3.41 \tag{1}$$

$$M_Q = \log_2 \left( \frac{2A_{\rm in} \cdot K_{\rm VCO} \cdot 2N_{\rm FDC}}{f_{\rm CLK}} \right) \tag{2}$$

where $A_{in}$ is the input amplitude and BW represents the noise-integration BW. The in-band signal-to-noise ratio (SNR<sub>PN</sub>), dominated by the VCO phase noise (PN), is equal to

$$\text{SNR}_{\rm PN} = 10 \cdot \log_{10} \left[ \frac{(A_{\rm in} \cdot K_{\rm VCO})^2}{16 \cdot \mathcal{L}(\Delta f) \cdot \Delta f^2 \cdot \text{BW}} \right], \quad \mathcal{L}(\Delta f) \propto \frac{(f_{\rm VCO,0})^2}{P_{\rm VCO}} \tag{3}$$

where  $\mathcal{L}(\Delta f)$  represents the thermal PN<sup>1</sup> at an offset  $\Delta f$  from the rest (quiescent) oscillation frequency  $f_{\text{VCO},0}$  ( $V_{\text{in}} = 0.1 \text{ V}$ ),



Fig. 3. VCO frequency tuning curves depicting (a) tradeoff between tuning range and linearity (varying the slope,  $K_{tune}$ ), (b) effect of large  $V_{DD}$ -induced variations, and (c) significant reduction to be applied to the VCO gain to prevent the overflow of the FDC in response to  $V_{DD}$  variations.

with PN being further dependent on the VCO power  $P_{VCO}$ . See [32] for the derivation of the above expressions from first principles and [33] for the analysis of pseudo-digital RO noise and power efficiency.

The simplest optimization procedure is to first maximize the sampling speed of the SAFF within the FDC [$f_{CLK}$ in (1)] to target a large oversampling ratio (OSR). Then, design the VCO with $K_{tune}$ close to one so as to maximize its tuning range, $2A_{in}K_{VCO}$, essentially mapping the input voltage range to occupy as much of the frequency range from dc to $f_{\text{CLK}}/2$ as possible. From here, further reduction in quantization noise is achieved by increasing the number of parallel FDCs [$N_{FDC}$ in (2)]. Thermal noise is suppressed by burning more power in the VCO while maintaining the same rest oscillation frequency $f_{VCO,0}$ [$\mathcal{L}(\Delta f)$ in (3)]. The result, however, is a highly nonlinear V-to-f conversion that looks like the $K_{\text{tune}} = 0.5$ tuning curve of Fig. 3(a). To overcome this with minimal hardware overhead, the pseudo-differential topology and the mixed-mode voltage/current resistor tuning scheme are chosen to cancel second- and third-order harmonic distortion (HD2 and HD3) components, respectively. At 0.2 V, the VCO operates in weak inversion and exhibits the highest power efficiency [33]. This optimization approach, therefore, allows the open-loop VCO-based ADC to operate near the upper limit of its theoretical performance although, unfortunately, it becomes impractical when PVT variations are taken into consideration.
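As a short worked consequence of (1) and (2), consider the idealized case in which the tuning range exactly fills the available quantization range, $2A_{\rm in}K_{\rm VCO} = f_{\rm CLK}/2$. This is a sketch that assumes an ideal first-order noise-shaped quantizer and ignores thermal noise and distortion:

$$\begin{aligned} M_Q &= \log_2 \left( \frac{(f_{\rm CLK}/2) \cdot 2N_{\rm FDC}}{f_{\rm CLK}} \right) = \log_2 N_{\rm FDC} \\ \text{SNR}_{Q} &= 6.02 \cdot \log_2 N_{\rm FDC} + 30 \cdot \log_{10} \left( \frac{f_{\rm CLK}}{2\text{BW}} \right) - 3.41 \end{aligned}$$

Under this assumption, each doubling of $N_{\rm FDC}$ adds about 6.02 dB, and each doubling of $f_{\rm CLK}$ (with the tuning range scaled along with it) adds about 9.03 dB. If the tuning range is instead held fixed, doubling $f_{\rm CLK}$ lowers $M_Q$ by one, so the net improvement shrinks to about 3.01 dB.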

In a more realistic design,  $V_{in}$  must be sufficiently attenuated [toward the  $K_{\text{tune}} = 0.1$  tuning curve of Fig. 3(a)] so that  $V_{\text{tune}}$  exercises only a relatively linear portion of the VCO tuning curve [24], [34], [35]. This reduction in  $K_{\text{tune}}$  comes at the severe cost of both a higher thermal and quantization noise. To make matters worse, shifts in  $f_{\rm VCO,0}$  due to  $V_{\rm DD}$ (or, similarly, temperature) variations can cause the VCO to oscillate at frequencies much higher than  $f_{\text{CLK}}/2$ , which would thus lie outside of the quantization range of the following XOR-based FDC [see Fig. 3(b)]. Coarse-fine counting FDC architectures [30], [31], [34], [35] can accommodate  $f_{\rm VCO} \gg f_{\rm CLK}/2$  efficiently although, contrarily, they embed edge-triggered sequential logic that must operate at a speed even higher than  $f_{\text{CLK}}$  (which, as stated earlier, should conveniently be maximized). Moreover, intra-die mismatch induces random propagation delay variability that impacts the

<sup>&</sup>lt;sup>1</sup>Neglecting flicker noise for simplicity such that  $\mathcal{L}(\Delta f) \cdot \Delta f^2$  is constant.

reliability of high-speed flip-flops, leading to setup and hold time violations.

A more practical and reliable solution is to slow down the VCO with a large number of delay stages $N_{del}$ [see Fig. 3(c)] or, optionally, by introducing excess capacitive load $C_L$ at each output terminal of the delay stage. To maintain a sufficient SNR<sub>Q</sub>, an increase in $N_{FDC}$ is then required to compensate for the reduction in $K_{\rm VCO}$. For example, with a 3-MHz tuning range, $N_{\rm FDC} \approx 16$ is required to achieve an effective resolution of 10-bit over 160-kHz BW and at 35 MS/s, assuming a THD <-70 dB (so that the ADC resolution is not limited by distortion) and PN of -100 dBc/Hz at 100-kHz offset. The under-utilized quantization range (blue-shaded regions in Fig. 3) and the relatively high working frequency of the digital logic make the use of many parallel FDCs inefficient. To prevent timing violations with VT variability and statistical mismatch, propagation delay margins must be overestimated. This translates into a design with prohibitively large devices or an unnecessarily slow system clock $f_{\text{CLK}}$, both being suboptimal solutions. In addition, the long clock distribution, as well as the heavily pipelined output summation logic and the multi-bit output decimation filter [to realize the topology in Fig. 2(b)], would all contribute to an overall power consumption $5 \times$ to $10 \times$ higher than that of the analog section of the ADC (the ring-VCO).
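The noise budget behind this example can be explored with a small script that evaluates (1)-(3) directly. It is a single-ended sketch under stated assumptions: the function names are hypothetical, the 3-MHz tuning range is taken to be $2A_{in}K_{VCO}$, and distortion, flicker noise, and the pseudo-differential arrangement are not modeled, so the absolute numbers should not be expected to reproduce the paper's figures.

```python
import math


def snr_q_db(tuning_range_hz, n_fdc, f_clk_hz, bw_hz):
    """Quantization-noise-limited SNR from (1) and (2); tuning range = 2*A_in*K_VCO."""
    m_q = math.log2(tuning_range_hz * 2 * n_fdc / f_clk_hz)
    return 6.02 * m_q + 30 * math.log10(f_clk_hz / (2 * bw_hz)) - 3.41


def snr_pn_db(tuning_range_hz, pn_dbc_hz, offset_hz, bw_hz):
    """Phase-noise-limited SNR from (3), with A_in*K_VCO = tuning range / 2."""
    pn_linear = 10 ** (pn_dbc_hz / 10)
    peak_deviation = tuning_range_hz / 2
    return 10 * math.log10(
        peak_deviation ** 2 / (16 * pn_linear * offset_hz ** 2 * bw_hz)
    )


def combine_db(*snr_db):
    """Combine uncorrelated noise contributions that are given as SNRs in dB."""
    return -10 * math.log10(sum(10 ** (-s / 10) for s in snr_db))


# Values quoted in the text: 3-MHz tuning range, 35 MS/s, 160-kHz BW,
# and PN of -100 dBc/Hz at a 100-kHz offset.
for n_fdc in (1, 4, 16):
    q = snr_q_db(3e6, n_fdc, 35e6, 160e3)
    pn = snr_pn_db(3e6, -100, 100e3, 160e3)
    total = combine_db(q, pn)
    print(f"N_FDC={n_fdc:2d}  SNR_Q={q:5.1f} dB  SNR_PN={pn:5.1f} dB  total={total:5.1f} dB")
```

In this simplified model, $N_{FDC} = 16$ places the quantization-limited SNR several dB above the phase-noise-limited SNR, so further FDC slices would bring little benefit, whereas a single slice leaves the ADC strongly quantization-limited.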

### C. Re-Engineering the Open-Loop VCO-Based ADC

Faced with the daunting challenges of deep-subthreshold operation, several advancements of the open-loop VCO-based ADC, as visualized in Fig. 4, are proposed as follows.

1) Embedded PVT Sensing: Since ROs offer a digital-intensive, low-cost solution for PVT sensing [36]–[38], a replica VCO (input terminal grounded) is proposed to dynamically track VT shifts via its free-running frequency $f_{\text{REF}}$. A low-rate, low-resolution (therefore, reliable) digital counter could be used to provide virtually costless digitization of $f_{\text{REF}}$. At the system level, $f_{\text{REF}}$ can be used for the dynamic voltage-frequency scaling (DVFS) [39] of the entire energy-harvesting sensor node. Furthermore, the replica VCO can serve as a multi-phase clock to the aforementioned dc–dc converters, such as a switched-cap voltage doubler [3], [6], [40] to generate $2V_{\text{DD}}$, which may be necessary elsewhere in the system.

2) Variation-Aware VCO Linearization: Harmonic distortion cancellation in the analog domain eliminates a significant portion of hardware required with digital NLC techniques. A high-gain, power-efficient VCO can then be designed, but the extreme sensitivity to PVT variations persists. Ultimately, if the energy constraints of the energy harvester do not allow for power-hungry, expensive on-chip calibration procedures, the ADC is practically constrained to operate at a single, stable temperature environment where the supply voltage is stringently regulated [29]. It is shown in [16] that, by tuning the input resistor network, the ADC linearity performance can be recovered at different  $V_{DD}$  points. We expand upon this in Section III to investigate whether  $f_{REF}$  can be directly utilized to extend the variation tolerance of the VCO open-loop analog linearization technique.



Fig. 4. (a) Conceptual block diagram of the proposed architecture. The frequency tuning curves demonstrate the beat-frequency extraction in the case of a relatively (b) slow and (c) fast VCO operations. (d) Phase folding interleaves four phases to utilize the available frequency quantization range of the single-bit FDC. The equivalent PFM spectra [46] for (c) and (d) are shown in (e), taking into account only the first modulation sideband.

3) Beat-Frequency Extraction: The rest oscillation frequency  $f_{VCO,0}$  is equivalent to a dc voltage "bias" that does not carry any signal information (as opposed to the signal component  $2A_{in}K_{VCO}$ ) and is anyway canceled by the ADC pseudo-differential configuration. Yet, in order to not overflow the FDC, the VCO must be slowed down significantly to suppress large PVT-induced shifts in  $f_{VCO,0}$ . This results in a massive under-utilization of the available FDC quantization range. Together with the need for a larger number of parallel FDC slices ( $N_{FDC}$ ), the high-speed synchronous digital circuitry becomes increasingly inefficient and unreliable.

By setting the gain of the replica VCO to be slightly larger than that of the main VCO, $f_{\text{REF}}$ remains unconditionally higher than $f_{\text{VCO,max}}$ [see Fig. 4(b) and (c)]. Both $f_{\text{REF}}$ and $f_{\text{VCO}}$ are passed through a mixer stage to extract their beat-frequency $f_{\text{BF}}$ ($= |f_{\text{REF}} - f_{\text{VCO}}|$) in the feedforward path, representing the useful signal component. Visualizing this in the frequency-domain for a single input tone [see Fig. 4(e)], the beat-frequency interface between the VCO and FDC performs a *translation* of $f_{\text{VCO}}$ to an intermediate frequency (IF). Consequently, the down-converted modulation products ($f_{\text{BF}}$)



Fig. 5. Top-level implementation of the proposed 0.2-V VCO-based ADC core. The subscripts  $_{+/-}$  denote the respective positive and negative complementary halves of the pseudo-differential ADC. The phases within each half are differential and are denoted by the subscripts  $_{p/n}$ .

do not fall within the signal band (BW). The *baseband* signal information is preserved, while the complete decoupling of  $f_{\rm VCO,0}$  from  $K_{\rm VCO}$  allows the convenient design of large-gain VCOs.

The purposely introduced gain mismatch between the main and replica VCO causes $f_{BF}(V_{in} = 0)$ to drift by a few hundred kHz as $V_{DD}$ varies (i.e., from 0.2 to 0.25 V) although this represents barely 5% of the >10-MHz drift experienced instead by $f_{VCO}$ versus $V_{DD}$, indicating that $f_{BF}$ is significantly *less sensitive* to VT variations than $f_{VCO}$. Another advantage brought by the proposed beat-frequency interface is that it relaxes the frequency tuning requirements of the main VCO (e.g., a ~3× smaller programmable capacitor bank is needed compared to [16]) since the frequency shift that needs to be covered is that of $f_{BF}$ (i.e., the difference of two frequency components) and no longer the absolute $f_{VCO}$ shift.
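The reduced sensitivity of the beat frequency can be illustrated with a toy first-order model. Every number below is an assumption made for illustration rather than a value from the paper: a VT shift is taken to scale both oscillators by the same factor, and the replica is assumed to run a fixed 5% faster than the main VCO at rest.

```python
def beat_frequency(f_ref_hz: float, f_vco_hz: float) -> float:
    """Beat frequency extracted by the mixer: f_BF = |f_REF - f_VCO|."""
    return abs(f_ref_hz - f_vco_hz)


# Hypothetical first-order model (not from the paper): a VT shift scales both
# oscillators by the same factor; the replica is a fixed ratio faster at rest.
REPLICA_RATIO = 1.05    # assumed replica-to-main frequency ratio
F_VCO_NOMINAL = 10e6    # assumed rest frequency at nominal V_DD, in Hz

for vt_scale in (1.0, 1.5, 2.0):  # assumed VT-induced speed-up factors
    f_vco = F_VCO_NOMINAL * vt_scale
    f_ref = REPLICA_RATIO * f_vco
    f_bf = beat_frequency(f_ref, f_vco)
    print(f"scale={vt_scale:.1f}  f_VCO={f_vco / 1e6:5.1f} MHz  f_BF={f_bf / 1e6:4.2f} MHz")
```

In this model the drift of $f_{BF}$ equals the assumed mismatch (here 5%) times the drift of $f_{VCO}$, so a 10-MHz shift in the oscillator moves the beat frequency by only 0.5 MHz.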

4) Phase Interpolation (PI) and Phase Folding (PF): When the VCO tuning range is limited, a large number of FDC slices (i.e.,  $N_{\rm FDC}$ ) may still be required to sufficiently suppress the quantization noise. A viable remedy is to take inspiration from the folding flash ADC architecture [41], where each voltage comparator is dedicated to multiple evenly distributed level-crossings rather than just one. Similarly, the VCO-based ADC can also benefit from such a concept, as its quantization process is inherently flash-based. For example, if  $f_{BF,max}$  <  $f_{\text{CLK}}/8$  [see Fig. 4(d)], a single phase can be interpolated by  $4 \times$  and then subsequently folded/interleaved back into a single wire. This results in an implicit frequency multiplication by the same factor of 4 (it is now  $f_{4BF}$  to be upper bounded to  $f_{\rm CLK}/2$ ). The pulse-frequency modulation (PFM) equivalence of this  $\times 4$  frequency *multiplication* is the up-conversion and expansion of the beat-frequency  $(f_{BF})$  modulation sideband shown in Fig. 4(e). The hardware resources of a single FDC slice are now shared among four phases (i.e., through analog recombination) to utilize its full quantization range. Arithmetically, each FDC slice computes the function

$$D_{\text{out,FDC}} = \frac{4 \cdot |f_{\text{REF}} - f_{\text{VCO}}(V_{\text{in}})|}{f_{\text{CLK}}/2} = \frac{2 \cdot f_{\text{4BF}}}{f_{\text{CLK}}}. \tag{4}$$
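The frequency multiplication implied by phase folding, and the resulting FDC output in (4), can be checked with a behavioral sketch. It assumes an idealization that is not taken from the paper: the four interpolated phases are modeled as ideal square waves spaced by a quarter of a half period and recombined with an XOR, which stands in for the analog recombination. All function names and the example frequencies are hypothetical.

```python
def folded_phase_bit(f_bf_hz: int, f_clk_hz: int, n: int, fold: int = 4) -> int:
    """XOR of `fold` evenly spaced phases of a square wave at f_BF, at t = n / f_clk.

    Phase k is delayed by k / fold of a half period. The XOR recombination is
    an assumed idealization used only for illustration.
    """
    bit = 0
    for k in range(fold):
        # floor(2 * f_BF * t - k / fold), evaluated with integer arithmetic
        half_cycles = (2 * fold * f_bf_hz * n - k * f_clk_hz) // (fold * f_clk_hz)
        bit ^= half_cycles % 2
    return bit


def d_out_fdc(f_ref_hz: int, f_vco_hz: int, f_clk_hz: int, fold: int = 4) -> float:
    """Ideal FDC slice output from (4); valid only while fold * f_BF < f_clk / 2."""
    f_bf = abs(f_ref_hz - f_vco_hz)
    if fold * f_bf >= f_clk_hz / 2:
        raise ValueError("FDC overflow: folded beat frequency exceeds f_CLK/2")
    return fold * f_bf / (f_clk_hz / 2)


def measured_density(f_bf_hz: int, f_clk_hz: int, n_samples: int, fold: int = 4) -> float:
    """Density of ones after XOR-ing consecutive samples of the folded waveform."""
    ones = 0
    prev = folded_phase_bit(f_bf_hz, f_clk_hz, 0, fold)
    for n in range(1, n_samples + 1):
        cur = folded_phase_bit(f_bf_hz, f_clk_hz, n, fold)
        ones += cur ^ prev
        prev = cur
    return ones / n_samples


F_CLK = 35_000_000
F_REF, F_VCO = 12_000_000, 10_500_000  # assumed values giving f_BF = 1.5 MHz
expected = d_out_fdc(F_REF, F_VCO, F_CLK)
measured = measured_density(abs(F_REF - F_VCO), F_CLK, 35_000)
print(expected, measured)
```

Both values come out at 12/35, i.e., the folded waveform toggles as a square wave at $4f_{BF}$ would, and `d_out_fdc` raises an error once $f_{BF}$ reaches $f_{CLK}/8$, mirroring the bound stated above.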

### D. ADC Implementation

Fig. 5 shows the top-level architecture of the implemented pseudo-differential VCO-based ADC. The entire core operates at a nominal supply of 0.2 V. With respect to each complementary half (positive and negative), the input signal  $V_{in+}$  ( $V_{in-}$ ) directly connects to a resistor divider network [17], [23] and subsequently modulates the oscillation frequency  $f_{VCO+}$  ( $f_{VCO-}$ ) of a 16-stage ring-VCO. The input resistor network can accommodate different input signal ranges and significantly mitigates the HD3 of the VCO.

An array of phase detector (PD)-based digital mixers with a merged passive PI network extracts the frequency difference between $f_{VCO\pm}$ and the replica VCO frequency $f_{REF}$. A 16-phase beat-frequency waveform $I1_+$ ($I1_-$) with frequency $f_{BF+}$ ($f_{BF-}$) is reconstructed. The $2\pi$ phase range spanned by $I1_{\pm}$ (at $f_{BF}$) is then folded into four-phase sub-ranges at node $I4_{\pm}$ to interleave the 16 phases originally required in the case of directly processing $f_{BF}$ ($N_{FDC} = 16$), into only 4 ($N_{FDC} = 4$).

The complementary halves of the ADC digitize the multiphase frequency information  $f_{4BF\pm}$  at I4<sub>±</sub> using a bank of four paralleled 1-bit XOR-based FDCs (an optimized version of [16]) operating at the nominal oversampling clock frequency  $f_{CLK}$  of 35 MHz. The four FDC pulse-density-modulated (PDM) output bit-streams are then individually processed with a polyphase decimate-by-4, second-order cascaded-integrator-comb (CIC) digital filter (DEC blocks), and recombined at  $f_{CLK}/4$ -rate to produce the digital outputs  $D_{out\pm}$ . A digital subtractor generates the 8-bit 2's-complement final output  $D_{out}$ .
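For readers unfamiliar with CIC decimation, the sketch below shows a plain second-order, decimate-by-4 CIC filter acting on a 1-bit stream. It is a textbook (non-polyphase) form written for illustration; the polyphase structure and word lengths used on the chip are not described in this section, and the function name is hypothetical.

```python
def cic_decimate(bits, rate=4, order=2):
    """Plain CIC decimator: `order` integrators, downsample by `rate`, `order` combs."""
    integrators = [0] * order
    comb_state = [0] * order  # previous input of each comb stage
    output = []
    for n, x in enumerate(bits):
        value = x
        for i in range(order):  # integrators run at the input rate
            integrators[i] += value
            value = integrators[i]
        if (n + 1) % rate == 0:  # keep every `rate`-th sample
            for i in range(order):  # combs run at the decimated rate
                diff = value - comb_state[i]
                comb_state[i] = value
                value = diff
            output.append(value)
    return output


print(cic_decimate([1] * 16))
print(cic_decimate([1, 0] * 8))
```

The DC gain of this structure is `rate ** order` = 16, so an all-ones stream settles to 16 after the first output sample and a stream with 50% density settles to 8.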



Fig. 6. Circuit-level implementation details of the pseudo-differential ring-VCO.

## III. RING-VCO NETWORK

### A. VCO Circuit Implementation

The ring-VCO presented in Fig. 6 consists of a loop of 16 delay stages, implemented with CMOS pseudo-differential inverters. A large number of delay stages (16 as opposed to 8 in [16]) sacrifices some tuning range to suppress the dramatic increase in VCO oscillation frequency in the face of "fast" VT conditions. It also helps in reducing both flicker PN [42] and $f_{\rm VCO}$ mismatch (via stage averaging, thus allowing better HD2 cancellation) while ensuring robust startup. The inverter outputs are passively coupled with feedforward 80-k$\Omega$ resistors to impose the proper edge alignment in the RO and, thus, establish a stable oscillation mode.

The aspect ratio of both pMOS and nMOS devices of the inverter delay cell is 16 $\mu$m/30 nm, where the channel width is chosen to optimize the noise-power tradeoff for the nominal $V_{DD}$ of 0.2 V. A weak nMOS-only latch (6 $\mu$m/50 nm) improves the VCO output voltage swing since the voltage across the main inverters $(V_{DD} - V_{tune})$ is amplitude modulated at the nMOS source control node $V_{tune}$, i.e., at the resistive partition of the analog input voltage $V_{in}$ between resistors $R_1$ and $R_2$. The output phases are further buffered using inverter cells. The programmability of the input resistors $R_1$ and $R_2$ (4-bit programmable within 2.62–5 and 0.12–2.5 k$\Omega$, respectively), of the capacitor banks at the output of each delay stage $C_L$ (4-bit programmable within 50–125 fF) and the tunability of the pMOS n-well voltage BODYP (frequency tuning sensitivity of 6 kHz/mV) represent the tuning "knobs" that are integrated into the VCO core, used to properly shape its characteristics (i.e., HD3, HD2, $K_{\rm VCO}$, and $f_{\rm VCO,0}$).

### B. VCO Linearization Characteristics

The propagation delay of this inverter-based delay cell (assuming identical rise and fall voltage slopes) operating in weak inversion can be expressed as

$$T_{\rm del} = \frac{\alpha V_{\rm ring} \cdot C_L}{\beta I_{D0} \frac{W}{L} e^{(\alpha V_{\rm ring} - V_t)/n V_{\rm therm}}} \tag{5}$$

where  $\alpha V_{\text{ring}}$  is the output voltage swing, being the product of  $V_{\text{ring}}$  (the difference  $V_{\text{DD}} - V_{\text{tune}}$ ) and  $\alpha \ (\geq 1)$ , a factor that boosts the main delay-cell inverter via the weak nMOS latch.  $I_{D0}$  is the MOS saturation current, n is the subthreshold slope factor, and  $V_{\text{therm}}$  is the well-known thermal voltage kT/q. Factor  $\beta$  (<1) embeds the rather dynamic drain current modulation as  $C_{\text{L}}$  is charged/discharged by the nMOS/pMOS paths. The  $V_{\text{in}}$ -to- $f_{\text{VCO}}$  transfer characteristics are expressed as

$$\begin{aligned} f_{\rm VCO} &= \frac{1}{2N_{\rm del}T_{\rm del}} \\ f_{\rm VCO} &= b_0 + b_1 V_{\rm tune} + b_2 V_{\rm tune}^2 + b_3 V_{\rm tune}^3 \\ & b_1, b_3 < 0, \quad b_2 > 0 \end{aligned} \tag{6}$$

$$\begin{aligned} V_{\rm tune} &= \frac{R_2}{R_1 + R_2} [V_{\rm in} + I_{\rm tune}(V_{\rm tune}) \cdot R_1] \\ V_{\rm tune} &= a_0 + a_1 V_{\rm in} + a_2 V_{\rm in}^2 + a_3 V_{\rm in}^3, \quad a_1, a_2, a_3 > 0 \end{aligned} \tag{7}$$

$$\begin{aligned} f_{\rm VCO}(V_{\rm in}) &= c_0 + c_1 V_{\rm in} + c_2 V_{\rm in}^2 + c_3 V_{\rm in}^3 \\ c_1 &\approx a_1 b_1, \quad c_3 \approx a_1^3 b_3 + a_3 b_1 + 2a_1 a_2 b_2 \end{aligned} \tag{8}$$

where  $I_{\text{tune}}$  is the current flowing from the RO into the resistor network  $R_1-R_2$  and is a complex function of  $V_{\text{tune}}$ , supply voltage, temperature, threshold voltage, and MOS transistor dimensions. The combined characteristics of the input resistor network and  $I_{\text{tune}}$  perform analog predistortion of  $V_{\text{tune}}$ .
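The expressions for $c_1$ and $c_3$ in (8) follow from substituting the polynomial (7) into (6). The sketch below checks this numerically by composing the two polynomials; the coefficient values are hypothetical, chosen only to satisfy the sign constraints in (6) and (7), and $a_0$ is set to zero (i.e., $V_{tune}$ is measured relative to its rest value).

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest order first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def poly_compose(outer, inner):
    """Coefficients (lowest order first) of outer(inner(x))."""
    result = [0]
    power = [1]  # inner(x) ** 0
    for coeff in outer:
        term = [coeff * c for c in power]
        size = max(len(result), len(term))
        result = [
            (result[i] if i < len(result) else 0) + (term[i] if i < len(term) else 0)
            for i in range(size)
        ]
        power = poly_mul(power, inner)  # next power of inner(x)
    return result


# Hypothetical coefficients (not from the paper) obeying the stated signs.
a = [0, 2, 1, 1]    # a0..a3 with a1, a2, a3 > 0
b = [5, -3, 2, -1]  # b0..b3 with b1, b3 < 0 and b2 > 0

c = poly_compose(b, a)
c1_formula = a[1] * b[1]
c3_formula = a[1] ** 3 * b[3] + a[3] * b[1] + 2 * a[1] * a[2] * b[2]
print(c[1], c1_formula)
print(c[3], c3_formula)
```

For these values the composed polynomial gives $c_1 = -6$ and $c_3 = -3$, in agreement with the closed-form expressions in (8).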

As revealed in Fig. 5, the positive and negative main VCOs operate in the pseudo-differential configuration so that  $c_2$  is canceled and the nonlinearity of the VCO differential gain is largely dominated by  $c_3$ . When  $R_1 || R_2 \ll V_{\text{ring}} / I_{\text{tune}}$ , the nonlinear current  $I_{tune}$  interacts weakly with the resistor network.  $V_{\text{tune}}$  almost linearly scales with  $V_{\text{in}}$  ( $a_2 \ll a_1$  and  $a_3 \ll a_2$ ). Because the  $V_{\text{tune}}$ -to- $f_{\text{VCO}}$  transfer characteristic is a result of the exponential drain current of the deep-subthreshold MOS devices which charges/discharges  $C_L$  [see (5) and (6)], the differential VCO gain  $\Delta f_{\rm VCO,diff} / \Delta V_{\rm in,diff}$  (thus, dominated by HD3) exhibits an intrinsically expansive behavior (red short-dashed curves of Fig. 7), where  $c_1 < 0$  and  $c_3 \approx$  $a_1^3 b_3 < 0$ . When  $R_1 || R_2 \gg V_{\text{ring}} / I_{\text{tune}}$ ,  $I_{\text{tune}}$  interacts strongly with the resistor network  $(a_2 \neq 0)$ . The VCO V-to-f tuning curve can be shaped from being expansive into compressive (black long-dashed curves of Fig. 7, where  $c_3 > 0$  and  $a_1^3b_3 + a_3b_1 > -2a_1a_2b_2$ ). The transition between these two modes  $(c_3 = 0 \text{ if } a_1^3 b_3 + a_3 b_1 = -2a_1 a_2 b_2)$  provides optimal linearity (blue lined curves of Fig. 7). The V-to-f linearization is, therefore, dependent on the ratio  $a_2/a_1$ .
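The transition between the expansive and compressive regimes can be sketched numerically from (8). The coefficients below are hypothetical, normalized values that only respect the sign constraints of (6) and (7), and the HD3 estimate uses the standard small-distortion approximation $|c_3 A^2 / (4c_1)|$ for a sinusoid of amplitude $A$, which is an assumption added here rather than a formula from the paper.

```python
import math


def c3_coefficient(a1, a2, a3, b1, b2, b3):
    """Third-order coefficient from (8)."""
    return a1 ** 3 * b3 + a3 * b1 + 2 * a1 * a2 * b2


def a2_for_null(a1, a3, b1, b2, b3):
    """Value of a2 that makes c3 = 0, obtained by solving (8) for a2."""
    return -(a1 ** 3 * b3 + a3 * b1) / (2 * a1 * b2)


def hd3_dbc(c1, c3, amplitude):
    """Small-distortion HD3 estimate for a sinusoidal input, in dBc."""
    ratio = abs(c3 * amplitude ** 2 / (4 * c1))
    return -math.inf if ratio == 0 else 20 * math.log10(ratio)


# Hypothetical normalized coefficients (not from the paper).
A1, A3 = 0.5, 0.1
B1, B2, B3 = -1.0, 0.8, -0.6
C1 = A1 * B1  # first-order gain from (8)

a2_null = a2_for_null(A1, A3, B1, B2, B3)
for a2 in (0.0, a2_null, 0.4):
    c3 = c3_coefficient(A1, a2, A3, B1, B2, B3)
    regime = "expansive" if c3 * C1 > 0 else "compressive"
    if abs(c3) < 1e-12:
        regime = "linearized"
    print(f"a2={a2:.3f}  c3={c3:+.3f}  {regime}  HD3={hd3_dbc(C1, c3, 0.1):.1f} dBc")
```

With $c_1 < 0$, a negative $c_3$ is classified as expansive and a positive $c_3$ as compressive, matching the sign conventions in the text; the estimated HD3 collapses at the intermediate value of $a_2$ where $c_3$ crosses zero.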

In reality, the imperfect matching of the pseudo-differential main VCOs causes residual even-order distortion components at the ADC output that, although not as severe as HD3, must also be mitigated. Simulations indicate an HD2 mean and worst case of -72 and -60 dBc, respectively. Applying  $\Delta$ BODYP (the differential body-bias component of the positive and negative main VCOs' BODYP) within 20 mV is sufficient to recover the HD2 performance to better than -80 dBc.

Fig. 7. (a) VCO's nonlinear interactions between  $I_{tune}$  and  $V_{tune}$ , resulting in (b) expansive, linearized, and compressive differential gain characteristics.

## C. HD3 Mitigation

Fig. 8(a) plots the VCO HD3 resulting from modifying the input resistor network characteristics. Having fixed  $R_1$  at 4 k $\Omega$  and computing HD3 across a range of supply voltages (180–220 mV) for different values of  $R_2$ , a notch-like behavior can be observed. The HD3 minima are encountered as the VCO differential gain transitions from its expanding to compressing nonlinearity. Prima facie, it may appear obvious that a large  $R_1/R_2$  ratio should be chosen so that the VCO exhibits a low HD3 over a wider range of supply voltages (i.e., a wide "notch valley"). However, we must realize that this is the equivalent of reducing  $K_{tune}$ . The improved PVT tolerance comes at the expense of a higher thermal and quantization noise power. The supply sensitivity, and thus PVT robustness of the VCO analog linearization technique, is quantified with the metric  $\Delta V_{\rm DD}$ , which defines the change in supply voltage (mV) that maintains HD3 below a tolerable upper bound (e.g., -65 dBc here). To quantify the VCO's input-referred PN (IR-PN) and, thus, power efficiency, we define the metric  $K_{\text{norm,VCO}} \equiv 2A_{\text{in}}(K_{\text{tune}}K_{\text{ring}})/f_{\text{VCO},0}$  [derived from (3)]. At the same  $P_{\rm VCO}$  and  $f_{\rm VCO,0}$ , a larger  $K_{\rm tune}$ , therefore, corresponds to less of the VCO PN being referred back to the input. The normalization of the VCO tuning range by  $f_{\rm VCO,0}$ implies that the input-referred thermal noise is independent of  $C_L$  and  $N_{del}$ , in accordance with [33].

The tradeoff between  $K_{\text{norm,VCO}}$  and  $\Delta V_{\text{DD}}$  is further elucidated in Fig. 8(b). When  $R_1/R_2 = 1$ ,  $K_{\text{norm,VCO}}$  of 1.25 can be achieved, which is a power-efficient means to obtain a low IR-PN. However,  $\Delta V_{\text{DD}}$  of less than 3 mV, along with a significant deterioration in HD3 outside of the notch valley (>-50 dBc), makes this configuration highly sensitive. On the contrary, large  $R_1/R_2$  ratios (e.g., 7) greatly extend  $\Delta V_{\text{DD}}$  to 40 mV [see Fig. 8(b)] but at the cost of a significantly lower  $K_{\text{norm,VCO}}$ , which drops to 0.3, thus resulting in an overall 12-dB deterioration in SNR<sub>PN</sub>.
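The quoted 12-dB penalty can be reproduced with a one-line calculation, assuming that at fixed $P_{\rm VCO}$ and $f_{\rm VCO,0}$ the phase-noise-limited SNR scales with the square of $K_{\text{norm,VCO}}$ (i.e., 20 dB per decade). The function name is illustrative.

```python
import math

def snr_pn_change_db(k_norm_new, k_norm_ref):
    """Change in SNR_PN (dB) when K_norm,VCO moves from k_norm_ref to k_norm_new.

    Assumes SNR_PN is proportional to K_norm,VCO squared at fixed power and f_VCO,0.
    """
    return 20.0 * math.log10(k_norm_new / k_norm_ref)

# R1/R2 = 7 (K_norm = 0.3) relative to R1/R2 = 1 (K_norm = 1.25)
penalty_ratio_7 = snr_pn_change_db(0.3, 1.25)
# Intermediate setting with K_norm = 0.75
penalty_ratio_3 = snr_pn_change_db(0.75, 1.25)
```

With the values from the text this gives roughly -12.4 dB for the large-ratio case, and about -4.4 dB for $K_{\text{norm,VCO}} = 0.75$.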



Fig. 8. Simulated VCO HD3 characteristics at (a) different  $R_1/R_2$  ratios versus  $V_{\text{DD}}$  and (b) corresponding  $K_{\text{norm,VCO}}/\Delta V_{\text{DD}}$  tradeoff.

A balanced tradeoff would be operating somewhere in-between these two extremes, where both the HD cancellation and suppression are observed (e.g., setting  $R_1/R_2 \approx 3$ ). By maintaining the  $R_1/R_2$  ratio but varying the absolute resistance values [see Fig. 9], the HD3 notch is shown to "slide" along with supply voltage changes while keeping  $K_{\text{norm,VCO}}$  approximately constant at 0.75. A simple control strategy of the values of  $R_1$  and  $R_2$  may be to exploit the PVT information provided by the replica VCO and define  $f_{\text{REF}}$ regions, visualized by the vertical blue-dashed lines in Fig. 9. When 2 MHz  $< f_{\text{REF}} < 3$  MHz,  $R_1$  is programmed to 5 k $\Omega$ , and when 3 MHz  $< f_{REF} < 4$  MHz,  $R_1$  is programmed to 4 k $\Omega$ . This further allows us to replace  $\Delta V_{DD}$  with an equivalent variation in  $f_{\text{REF}}$  ( $\Delta f_{\text{REF}}$ ), roughly approximated from Fig. 9 to be 100 kHz/mV. Therefore, by simply associating the measured  $f_{\text{REF}}$  quantity with  $(R_1 \text{ and } R_2)$  pairs, the fundamental tradeoff between IR-PN and HD3 PVT tolerance (now quantified by  $K_{\text{norm,VCO}}$  and  $\Delta f_{\text{REF}}$ , respectively) can be relaxed.

# D. Considerations for Variation-Aware VCO Linearization

Although  $f_{\text{REF}}$  can be directly observed and digitized, HD3 is only indirectly inferred, thus requiring the appropriate mapping from  $f_{\text{REF}}$  to  $(R_1 \text{ and } R_2)$  to provide the variation-aware VCO linearization. The considerations for this mapping are highly dependent on both the programmed  $R_1/R_2$ ratio and the expected VT operating environment. Suppose that the designer operates the VCO with a  $R_1/R_2$  ratio of 7.  $\Delta V_{\text{DD}}$  of 40 mV means that HD3 remains below -65 dBc for an equivalent  $\Delta f_{\text{REF}}$  of 4 MHz, e.g., 2 MHz  $< f_{\text{REF}} < 6$  MHz. With such a wide notch valley, the  $f_{\text{REF}} \rightarrow (R_1, R_2)$  mapping can be very imprecise (i.e., computed with simulated data) such that the nonlinearity calibration would not be required.



Fig. 9. Simulated VCO HD3 characteristics at different resistor values versus  $V_{\text{DD}}$  for an  $R_1/R_2$  ratio of 3.

This represents a "blind" in-field nonlinearity compensation. At the other design extreme, a  $R_1/R_2$  ratio below 1 yields  $\Delta V_{\text{DD}}$  of less than 3 mV. HD3 remains below -65 dBc for  $\Delta f_{\text{REF}}$  of merely 300 kHz. In this case, very little margin of error is afforded to the  $f_{\text{REF}} \rightarrow (R_1, R_2)$  mapping. Expensive on-chip digital calibration procedures may be unavoidable. A far more practical solution that avoids any on-chip nonlinearity calibration is to operate with a  $R_1/R_2$  ratio of around 3, with a  $\Delta f_{\text{REF}}$  of 1 MHz ( $\Delta V_{\text{DD}}$  of 10 mV), as visualized with the example in Fig. 9. This would offer sufficient error margins in the  $f_{\text{REF}} \rightarrow (R_1, R_2)$  mapping to accommodate a post-silicon factory two- $V_{\text{DD}}$ -point nonlinearity calibration procedure.<sup>2</sup>
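The three design points discussed above all follow from the same conversion between supply tolerance and reference-frequency tolerance. The sketch below applies the approximate 100-kHz/mV slope read from Fig. 9; the constant and function names are illustrative, and the slope is only a rough simulated estimate.

```python
F_REF_SLOPE_HZ_PER_MV = 100e3  # approximate slope read from Fig. 9

def delta_f_ref_hz(delta_vdd_mv, slope_hz_per_mv=F_REF_SLOPE_HZ_PER_MV):
    """Convert an HD3-compliant supply window (mV) into an f_REF window (Hz)."""
    return delta_vdd_mv * slope_hz_per_mv

# Supply windows quoted in the text for three resistor-ratio choices.
windows_mv = {"ratio ~7": 40.0, "ratio ~3": 10.0, "ratio <=1": 3.0}
windows_hz = {name: delta_f_ref_hz(mv) for name, mv in windows_mv.items()}
```

The resulting windows of about 4 MHz, 1 MHz, and 300 kHz indicate how precise the $f_{\text{REF}} \rightarrow (R_1, R_2)$ mapping has to be in each case.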

Consequently, to anticipate a realistic production scenario where only the factory calibration is available and to not significantly sacrifice power efficiency, we only consider the  $R_1/R_2$  ratio somewhere between 2 and 4. It is also worth noting that the PVT states can theoretically be inferred from the common-mode component of  $D_{out\pm}$ . Unfortunately, this would imply minimal re-configurability of the main VCOs, so as to not require a factory-level characterization across multidimensional combinations of VCO tuning knob configurations. In contrast, the beat-frequency interface simultaneously allows run-time programmability of the main VCOs in the feedforward path while retaining the frequency information solely dependent on PVT variations through  $f_{REF}$ .

The variation-aware VCO linearization approach can be extended further by also accounting for temperature variations, as shown with the contour plot in Fig. 10. The dark shaded regions, enclosed by the blue lines, correspond to the specific VT combinations where the VCO's HD3 is below -65 dBc (i.e., linearized). Interestingly, the white-dashed lines representing  $f_{\text{REF}}$  appear almost in parallel with this linearized HD3 shaded region. This suggests that, within the specified VT range, the HD3 can be indirectly inferred by solely observing  $f_{\text{REF}}$ , without requiring independent knowledge of both the temperature and  $V_{\text{DD}}$ .

Fig. 10. Simulated VCO HD3 (shaded contours) and  $f_{\text{REF}}$  (white-dashed lines) characteristics at different  $V_{\text{DD}}$  and temperature combinations for  $R_1 = 4 \text{ k}\Omega$  and  $R_2 = 1.3 \text{ k}\Omega$ .

## IV. ANALOG PHASE-DOMAIN SIGNAL PROCESSING

# A. APSP Signal Flow

Fig. 11 shows the diagram of a single complementary half of the APSP chain. All of the 16 buffered differential outputs of the main and replica VCOs ( $O_p\langle 1:16 \rangle$ ,  $O_n\langle 1:16 \rangle$  and  $\text{REF}_p\langle 1:16 \rangle$ ,  $\text{REF}_n\langle 1:16 \rangle$ , respectively) are fed into a matrix of digital mixers composed of XOR-PDs to generate four-phase differential (octal<sup>3</sup>) beat-frequency waveforms ( $P_p\langle 1:4 \rangle$ ,  $P_n\langle 1:4 \rangle$ ). A 4× passive PI network generates 12 additional differential phases. The phases are re-ordered into separate bundles of nets (categorized as A, B, C, and D) with  $45^{\circ}$  equidistant separation in phase space (IA<sub>p</sub> $\langle 1:4 \rangle$ , IA<sub>n</sub> $\langle 1:4 \rangle$  to ID<sub>p</sub> $\langle 1:4 \rangle$ , ID<sub>n</sub> $\langle 1:4 \rangle$ ). Two cascaded stages of ×2 PF logic gates spatially interleave all of the nets within a bundle into a single differential wire, yielding  $QA_p, QA_n$  to  $QD_p, QD_n$  at the output of the PF network. After each PF stage (I2i and I4i), the generated waveforms are weakly coupled through a resistive averaging ring [44] to maintain their phase relationship in the presence of large delay skew between all PF units. These resistive averaging rings can be further utilized to generate more interpolated phases or, alternatively, to reduce the number of phase folders (PFs) required at each APSP stage. In this prototype, a significant reduction in power consumption could have been obtained with a more optimized partitioning of the folding and interpolation factors.

Non-idealities incurred along the APSP chain are first-order noise shaped at the digital output as a consequence of operating directly on the phase information, achieving a level of error resilience for low input frequencies (i.e., within the signal BW). Furthermore, the use of mainly XOR and inverter gates preserves the digital nature of VCO-based ADCs. These "analog" logic gates display elasticity in terms of speed requirements, as their driving strength changes coherently with the PVT-induced VCO frequency shifts, unlike synchronous digital circuits, whose operations are paced by a clock signal.

<sup>2</sup>The  $f_{\text{REF}}$  margin considerations are very similar to the delay guard-bands required for digital processor dynamic variation tolerance [39], [43].

<sup>3</sup>This is in analogy to radio frequency receivers that down-convert the received signal to quadrature or octal [45] IF representation.

Fig. 11. Signal-flow diagram of APSP pseudo-differential half.

## B. SNR Degradation

Timing skews in the gate delays perturb the ideal locations of the phase zero-crossings and impose run-time constraints on  $f_{4BF,max}$ . To prevent phase-mismatch-induced spectral components near  $f_{CLK}/2$  from aliasing in-band after the phase-sampling operation and to ensure correct operation of the XOR-based FDC, the minimum delay between rising/falling edges at stage I4 (i.e.,  $0.5/f_{4BF}$  at  $V_{in} = 0.2$  V, without mismatch-induced timing deviations) must be larger than the clock sampling period. Consequently,  $f_{4BF,max}$  is limited to approximately 70% of  $f_{CLK}/2$  so as to allocate a safety margin. A lower bound of 3 MHz is also imposed to ensure that the spurious content of the modulation spectra occurs far out of the signal band [46].
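The resulting operating window for $f_{4BF}$ can be sketched as follows. The 34.6-MHz clock is the measured sampling rate reported for the prototype; the function names are illustrative, and the 70% margin and 3-MHz floor are the approximate figures given above rather than exact design limits.

```python
def f4bf_window_hz(f_clk_hz, margin=0.70, f_min_hz=3e6):
    """Return (lower, upper) bounds on f_4BF for a given sampling clock."""
    return f_min_hz, margin * f_clk_hz / 2.0

def edge_spacing_ok(f_4bf_hz, f_clk_hz):
    """True if the ideal edge spacing 0.5/f_4BF exceeds one clock period."""
    return 0.5 / f_4bf_hz > 1.0 / f_clk_hz

f_lo, f_hi = f4bf_window_hz(34.6e6)
```

For a 34.6-MHz clock the upper bound evaluates to about 12.1 MHz, which leaves headroom below the hard limit of $f_{CLK}/2 = 17.3$ MHz for mismatch-induced edge displacement.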

It is also interesting to note that the PN of the replica VCO degrades the  $SNR_{PN}$  of the single-ended ADC half by approximately 3 dB. The noise power is doubled at stage P/I1, given the frequency subtraction operation of the mixer stage and the fact that the main and replica VCOs exhibit the same level of uncorrelated PN. Fortunately, this represents merely a common-mode noise source and is, thus, canceled in the pseudo-differential ADC configuration.
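As a quick check of the 3-dB figure, assume that the main and replica VCOs contribute equal and uncorrelated phase-noise powers, written here as $\sigma_{\rm main}^2 = \sigma_{\rm REF}^2 = \sigma^2$ (symbols introduced only for this illustration). The noise powers add after the frequency subtraction, so the single-ended penalty is

$$\Delta \mathrm{SNR}_{\rm PN} = 10 \log_{10}\left(\frac{\sigma_{\rm main}^2 + \sigma_{\rm REF}^2}{\sigma_{\rm main}^2}\right) = 10 \log_{10} 2 \approx 3.0~\text{dB}.$$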

Under the constraints stated above, Fig. 12 plots the simulated SNR versus BW from 80 to 200 kHz ( $f_{\rm in} = 50 \text{ kHz}$  and  $A_{\rm in} = -17 \text{ dBV}$ ) of the FDC-digitized frequency information at stages O, I1, I2, and I4 within the APSP chain. Transient noise was enabled to include the thermal PN contribution of the BF-VCO network, while five Monte-Carlo simulations were run to investigate the effect of transistor mismatch on SNR degradation. For the 160-kHz BW configuration, the mean SNR at the input/output of the APSP chain (in the pseudo-differential configuration, thus  $O_+ - O_-$  and  $I4_+ - I4_-$ ) is 62.9/62.6 dB. Only minor degradation is observed due to the VCO-based ADC's noise-shaping property, as phase mismatch translates into an additive quantization error component, while the common-mode  $f_{\text{REF}}$  PN is canceled out.

Fig. 12. Simulated SNR versus BW for  $f_{in} = 50$  kHz and with 65 536 FFT points (five Monte-Carlo runs), including transient noise, when phases are digitized at the pseudo-differential output of the main VCO (O), PI (I1), and  $\times 2$  PF stages (I2 and I4).

# C. Group Delay Dispersion

The instantaneous phase  $\theta_{VCO}(t)$  of the VCO, modulated by the input sinewave  $x(t) = A_{in} \sin(2\pi f_{in}t)$ , is expressed as

$$\theta_{\rm VCO}(t) = 2\pi \int_{-\infty}^{t} K_{\rm VCO} \cdot x(t) dt$$
(9)

which is the signal to be processed by the APSP chain. Its output phase signal  $\theta_{4BF}$  will be distorted

$$\theta_{4BF}(t) = \theta_{VCO}(t) + \theta_{APSP,tot}(t)$$
(10)

by the amount  $\theta_{APSP,tot}(t) = 2\pi K_{APSP,tot} \cdot x(t)$  with  $K_{APSP,tot} = K_{APSP,I1} + K_{APSP,I2} + K_{APSP,I4}$ , where  $K_{APSP,tot}$  represents the total input voltage-to-phase conversion gain ( $V_{in}$ -to- $\theta_{APSP,tot}$ ), due to the individual gain contribution at nodes I1, I2, and I4, corresponding to the output of PI and  $\times 2$  PFs, respectively,

$$K_{\text{APSP},i} = k_{\text{PM},i} \cdot \frac{C_i}{g_{m,i}}, \quad i = \{I1, I2, I4\}.$$
 (11)

Fig. 13. Schematic-level simulations of SFDR versus input-frequency (32768 pt. FFT) when phases are digitized at the output of the main VCO (O), PI (I1), and  $\times 2$  PF stages (I2 and I4) for different capacitive loading conditions.

Each APSP stage is modeled by a nonlinear dispersion term  $k_{PM,i}$  [with dimension rad/(V · s) and where  $k_{PM,I1}$ ,  $k_{PM,I2}$ , and  $k_{PM,I4}$  are proportional to  $K_{VCO}$ ] scaled by a linear time constant determined by its transconductance  $g_{m,i}$  and capacitive load  $C_i$ . This delay dispersion term arises from each of the APSP stages, which process FM waveforms with non-saturated voltage swing and non-constant rise/fall times. The non-constant propagation delay of the logic gates, in turn, introduces an undesired modulation of the phase information. In other words, if the APSP input had a constant amplitude and rise/fall times (i.e., a strongly buffered signal), the APSP would not generate any phase distortion  $\theta_{APSP,tot}$ . The derivative (PM-to-FM conversion) introduces a deviation of the instantaneous frequency at I4

$$\begin{aligned}
f_{4BF}(t) &= \frac{d}{dt} \theta_{VCO}(t) + \frac{d}{dt} \theta_{APSP,tot}(t) \\
&= A_{in} \left[ K_{VCO} \sin(2\pi f_{in}t) + f_{in} K_{APSP,tot} \cdot \cos(2\pi f_{in}t) \right].
\end{aligned}$$
(12)

Since, at low input frequencies,  $f_{in}K_{APSP,tot} \ll 1$ , the low-frequency THD is dominated by the VCO's V-to-f nonlinearity, i.e., by the nonlinear  $K_{\rm VCO}$ . This phenomenon is clearly visible from Fig. 13, which plots the simulated SFDR of the FDC-digitized frequency information along the APSP chain at stages O, I1, I2, and I4, versus the input signal frequency from 10 to 150 kHz ( $A_{in} = -17$  dBV). At  $f_{in}$  of 10 kHz, the SFDR (dominated by HD3) is 74 dB at all processing stages, corresponding to the linearized  $K_{\rm VCO}$  (i.e., the one obtained from  $R_1$ - $R_2$  configured such that the optimal HD3 cancellation occurs). At  $f_{in}$  of 100 kHz, SFDR values of 74/70/67/64 dB, respectively, at stages O/I1/I2/I4 demonstrate that HD3 increases (i.e., worse SFDR) not only versus input frequency [see (12)] but also along the APSP chain [see (11)] since the propagation delay dispersion accumulates toward the output. Similarly, the delay dispersion is a function of the capacitive load of each stage; the SFDR at stage I4 is 69/60 dB for  $C_i$  equal to 5/45 fF, respectively (which are 20 fF lower/higher than the estimated post-layout extracted capacitive load of 25 fF per net).
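The frequency dependence implied by (12) can be illustrated with a minimal sketch that compares the dispersion (cosine) term against the wanted (sine) term. The gain values below are hypothetical placeholders chosen only to show the scaling; they are not extracted from the design.

```python
def dispersion_to_signal_ratio(f_in_hz, k_apsp_tot, k_vco):
    """Amplitude ratio of the cosine term to the sine term in (12)."""
    return f_in_hz * k_apsp_tot / k_vco

# Hypothetical placeholder gains, in units consistent with (12).
K_VCO = 10e6
K_APSP_TOT = 0.5

ratio_10k = dispersion_to_signal_ratio(10e3, K_APSP_TOT, K_VCO)
ratio_100k = dispersion_to_signal_ratio(100e3, K_APSP_TOT, K_VCO)
```

Whatever the actual gains, the ratio grows in proportion to $f_{in}$, which is consistent with the SFDR degrading at higher input frequencies and with larger $K_{APSP,tot}$ (i.e., further along the chain or with heavier capacitive loading).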

# V. APSP CIRCUITS

# A. PD Mixer Matrix

The XOR-PDs, organized as in Fig. 14(a), work in tandem to extract the beat frequency of incoming square waves (namely, vectors O and REF). When the frequencies of O ( $f_{VCO}$ ) and REF ( $f_{REF}$ ) are not identical, the alternating constructive and destructive interference patterns generate a waveform that "beats" at the periodicity of their frequency difference. In principle, only a single-phase XOR-PD (e.g.,  $O_{p/n}\langle 1 \rangle \oplus \text{REF}_{p/n}\langle 1 \rangle$ ) could be used, but the beat frequency would be relatively close to, or even overlap with,  $f_{\text{REF}}$ , the  $f_{\text{VCO}}$  modulation sideband, their integer multiples, and the summing frequency components, which cannot be sufficiently filtered out with a low-pass filter.

Fig. 14. Single slice of the multi-phase PD mixer: (a) circuit-level implementation and (b) associated timing waveforms.

By utilizing instead all of the 16 equidistant phases of O and REF, the entire complement of pulsewidth-modulated (PWM) signals is combined at the analog summation node  $P\langle 1 \rangle$  [47]. Spurious carrier tones are up-converted by  $16 \times$  and readily filtered out through a resistor–capacitor (*RC*) network, formed purely by the wiring parasitic resistance, the drain–source capacitances, and the channel resistance at the XOR-PD output, to reconstruct the *clean* beat-frequency waveform. From the time-domain perspective, this carrier up-conversion increases the transition density. The voltage ripple superimposed on the ideal beat waveform is minimized so that phase zero-crossings occur at time intervals of  $0.5/f_{BF}$ .

The PD differential output can be expressed as

$$P_{p}\langle 1 \rangle = \frac{1}{16} \sum [O_{p}\langle 1:16 \rangle] \oplus [\operatorname{REF}_{p}\langle 1:16 \rangle]$$
$$P_{n}\langle 1 \rangle = \frac{1}{16} \sum [O_{n}\langle 1:16 \rangle] \oplus [\operatorname{REF}_{n}\langle 1:16 \rangle].$$
(13)
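A behavioral sketch of (13) shows why summing all phases suppresses the carrier ripple. It assumes ideal square waves, taps spaced by $\pi/16$ within each ring, and no RC filtering; the frequencies are arbitrary illustrative values, and all function names are hypothetical.

```python
import math

def square(phase_rad):
    """Ideal 50% duty-cycle square wave: 1 in the first half period, else 0."""
    return 1 if (phase_rad % (2 * math.pi)) < math.pi else 0

def pd_output(t, f_vco, f_ref, n_phases=16):
    """Mean of n_phases XOR-PD outputs, mirroring the summation in (13)."""
    total = 0
    for k in range(n_phases):
        tap = math.pi * k / n_phases  # assumed tap spacing of both rings
        total += square(2 * math.pi * f_vco * t + tap) ^ square(2 * math.pi * f_ref * t + tap)
    return total / n_phases

def ideal_beat(t, f_vco, f_ref):
    """Ripple-free triangular beat waveform at |f_vco - f_ref|."""
    d = (2 * math.pi * (f_vco - f_ref) * t) % (2 * math.pi)
    return d / math.pi if d <= math.pi else 2 - d / math.pi

def worst_ripple(n_phases, f_vco=5e6, f_ref=4e6, n_samples=2000, dt=7.3e-9):
    """Largest deviation of the summed PD output from the ideal beat waveform."""
    return max(abs(pd_output(i * dt, f_vco, f_ref, n_phases) - ideal_beat(i * dt, f_vco, f_ref))
               for i in range(n_samples))
```

In this idealized model a single XOR-PD output deviates from the beat waveform by a large fraction of the full swing, whereas the 16-phase sum stays within roughly one sixteenth of it even before any low-pass filtering.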

The redundant information available in the phase space is exploited to compute the octal phase-shifted versions of (13).



Fig. 15. Passive phase interpolation: (a) circuit-level implementation and (b) associated timing waveforms.

A  $45^{\circ}$  shift is obtained with the appropriate circular rotation of vector *O* by physically re-ordering the input phases as follows:

$$P_{p}\langle 2\rangle = \frac{1}{16} \sum \left[ O_{p}\langle 5:16\rangle, O_{n}\langle 1:4\rangle \right] \oplus \left[ \operatorname{REF}_{p}\langle 1:16\rangle \right]$$
$$P_{n}\langle 2\rangle = \frac{1}{16} \sum \left[ O_{n}\langle 5:16\rangle, O_{p}\langle 1:4\rangle \right] \oplus \left[ \operatorname{REF}_{n}\langle 1:16\rangle \right]. \quad (14)$$

Further incremental shifts of 45° in the input phase ordering of vector *O* complete the set  $P_p\langle 1:4 \rangle$ ,  $P_n\langle 1:4 \rangle$ .
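The re-ordering in (14) amounts to a circular rotation of the tap vector in which wrapped taps are taken from the complementary output. The sketch below generates the tap ordering for an arbitrary rotation; the string labels and the helper name are illustrative, and the tap spacing of 11.25° (so that four taps give 45°) is an assumption based on 16 differential stages.

```python
def rotated_o_taps(shift, n=16):
    """Tap order of vector O after a circular rotation by `shift` positions.

    Returns (taps feeding the P_p mixer, taps feeding the P_n mixer).
    Taps that wrap around are taken from the complementary output, as in (14).
    """
    p_side, n_side = [], []
    for k in range(n):
        idx = k + shift
        wraps = (idx // n) % 2 == 1
        tap = idx % n + 1
        p_side.append(("On" if wraps else "Op") + str(tap))
        n_side.append(("Op" if wraps else "On") + str(tap))
    return p_side, n_side

p2, n2 = rotated_o_taps(4)   # ordering used for P<2> in (14)
p3, n3 = rotated_o_taps(8)   # a further 45-degree step
```

A rotation of 4 positions yields `Op5` to `Op16` followed by `On1` to `On4` on the positive side, matching (14); rotations of 8 and 12 positions give the remaining two phases under the same assumption.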

The level-sensitive XOR-PDs are simple by design and do not need to generate narrow pulses with sharp digital transition edges, unlike edge-triggered PD topologies. This is an important advantage because it reduces the switching speed requirements considerably. The XOR transistor length is upsized to 50 nm to conserve power consumption and suppress mismatch at the cost of increased area and capacitive loading at both input and output terminals. The width of MOS devices is sufficiently large ( $W_{pMOS} = 3 \ \mu m$  and  $W_{nMOS} = 1 \ \mu m$ ) to maintain a near rail-to-rail voltage swing of the beat-frequency waveforms.

# B. Passive Phase Interpolation

The locally interpolated nodes, e.g.,  $IB_p\langle 1 \rangle$ ,  $IC_p\langle 1 \rangle$ , and  $ID_p\langle 1 \rangle$  in Fig. 11, are formed by the resistor-ladder voltage division between two adjacent rising (falling) reference transition edges  $IA_p\langle 1 \rangle$  and  $IA_p\langle 2 \rangle$ , as shown in Fig. 15. The intermediate transition edges subdivide the phase-quantization space into approximately equal partitions. Phase interpolation requires differential reference signals, where  $ID_p\langle 4 \rangle$  connects into  $IA_n\langle 1 \rangle$ , and vice versa,  $ID_n\langle 4 \rangle$  into  $IA_p\langle 1 \rangle$ . This, in fact, creates a fully symmetric and circular resistive ring to obtain monotonic interpolated phases in the range  $0-2\pi$  at  $f_{BF}$ . The near-sinusoidal beat-frequency waveforms make the rise/fall times of the transition edges longer than the timing delay difference between such zero-crossings, thus preventing flat transition regions in the interpolated waveforms from severely degrading the quality of the interpolation. The interpolation resistor value is chosen to be 40 k $\Omega$ . For a larger resistor value (aside from its impractical area overhead and routing complexity), the interpolation would be weak. The finite parasitic capacitance at the interpolated nodes induces an *RC* time constant which prevents intermediate nodes from following the reference transition edges instantaneously [48]. On the other hand, a small resistor value would improve the PI linearity, but the lateral current paths would increase the static power consumption and reduce the voltage swing necessary to drive the succeeding phase folding circuitry.
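The "approximately equal partitions" can be quantified with an idealized model: two purely sinusoidal references 45° apart, an unloaded ladder of four equal resistors, and no parasitic capacitance. These are simplifying assumptions for illustration, and the function name is hypothetical.

```python
import math

def interpolated_zero_crossings_deg(phase_step_deg=45.0, n_segments=4):
    """Zero-crossing phases of the nodes between two sinusoidal references.

    References: sin(theta) and sin(theta - phase_step). Node m (1..n_segments-1)
    of an unloaded equal-resistor ladder is the weighted average
    ((n - m) * v_lead + m * v_lag) / n.
    """
    phi = math.radians(phase_step_deg)
    crossings = []
    for m in range(1, n_segments):
        w_lead = (n_segments - m) / n_segments
        w_lag = m / n_segments
        theta = math.atan2(w_lag * math.sin(phi), w_lead + w_lag * math.cos(phi))
        crossings.append(math.degrees(theta))
    return crossings

crossings = interpolated_zero_crossings_deg()
```

Under these assumptions the middle node crosses zero exactly at 22.5°, while the outer two land near 10.8° and 34.2° instead of the ideal 11.25° and 33.75°, i.e., a systematic interpolation error of under half a degree before loading effects are considered.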

## C. Phase Folding Logic

Signal folding in voltage-domain flash ADCs automatically compresses multiple level-crossings into a single wire. Bridging into the domain of frequency generation, clock multipliers and phase/edge combiners using windowed multiplexing [42] or self-switching [49] perform the identical task. To remain consistent with the ADC terminology, we refer to this concept as phase folding, as explicitly defined by the folding of spatially distributed phase-domain zero-crossings.

In this prototype, a single  $\times 2$  PF unit, as shown in Fig. 16(a), consisting of two CMOS XOR gates and a differential buffer stage, accepts the 90°-separated quadrature input phases [50]. The output is a 50% duty-cycle square wave at twice the input frequency. The tree of double-cascaded  $\times 2$  PF stages (*I2* and *I4*), thus, accepts  $2 \times 4$  phases  $45^{\circ}$  apart (each phase bundle at stage *I*1), which corresponds to the implicit frequency quadrupling.
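The frequency-doubling action of a ×2 PF, and the quadrupling obtained by cascading two of them, can be reproduced with ideal square waves. This is a single-ended behavioral sketch (no buffer stage, mismatch, or finite edge rates), and the function names are illustrative.

```python
import math

def square(phase_rad):
    """Ideal 50% duty-cycle square wave: 1 in the first half period, else 0."""
    return 1 if (phase_rad % (2 * math.pi)) < math.pi else 0

def fold_x2(phase_rad):
    """XOR of two square waves in quadrature (90 degrees apart)."""
    return square(phase_rad) ^ square(phase_rad - math.pi / 2)

def fold_x4(phase_rad):
    """Second x2 stage fed by two first-stage outputs 45 degrees apart."""
    return fold_x2(phase_rad) ^ fold_x2(phase_rad - math.pi / 4)

def edges_and_duty(fn, n=3600):
    """Count transitions over one input period (with wrap-around) and the duty cycle."""
    samples = [fn(2 * math.pi * (i + 0.5) / n) for i in range(n)]
    edges = sum(samples[i] != samples[i - 1] for i in range(n))
    return edges, sum(samples) / n
```

Over one input period the input square wave has 2 transitions, the ×2 output has 4, and the cascaded output has 8, each with a 50% duty cycle, which corresponds to the implicit frequency quadrupling from four phases spaced 45° apart.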

Local mismatches are in part mitigated through circuit sizing: the aspect ratio of each XOR-based PF is 16  $\mu$ m/40 nm and 8  $\mu$ m/40 nm for pMOS and nMOS devices, respectively. In addition, resistive rings (80-k $\Omega$  units) are placed at the output of each PF stage to average out the timing skews and ensure the monotonicity of the phase quantization levels. Finally, the buffer stage (L = 40 nm) not only acts to steepen the transition edges through positive feedback but also performs realignment of the pseudo-differential edges. Common-mode shifts are corrected in a feedforward manner to keep the propagating multi-phase waveforms from accumulating duty-cycle errors. This gain stage further isolates the inter-stage parasitic capacitance so as to limit the signal attenuation and suppress the conversion gain of RDF  $V_t$  variations into timing delay mismatch.

While the techniques described above proved sufficient for the target performance, more demanding BW specifications may necessitate 30-nm-length transistors and minimized capacitive loading, or may be better served by alternative mismatch- and dispersion-free PF schemes, such as [42].

## VI. EXPERIMENTAL RESULTS

Fig. 17 shows the chip micrograph of the proposed 0.2-V VCO-based ADC, fabricated in TSMC 28-nm LP CMOS and occupying an active area of 0.12 mm<sup>2</sup>. Peripheral circuitry includes the input clock buffer and internal clock generation, serial-to-parallel interface (SPI), and digital read-out (array of multiplexers, buffers, and level shifters).

Fig. 16. Phase folding logic for each phase bundle: (a) circuit-level implementation and (b) associated timing waveforms. Note that the phase-balancing resistors indicated in Fig. 11 are not shown.

Fig. 17. Chip micrograph of the VCO-based ADC prototype.

The output spectrum of a 50-kHz,  $0.4-V_{pp}$  (-17 dBV) differential input sinewave sampled at 34.6 MS/s (digital output at the decimated 8.65-MS/s rate) is presented in Fig. 18(a). The ADC prototype demonstrates a peak SNR/SNDR of 61.7 dB/59.9 dB over a 160-kHz BW. The SNR and HD2 of the *single-ended* digital output [see Fig. 18(b)] are 55.8 dB and -21.6 dBc, respectively, demonstrating effective cancellation of both  $f_{\rm REF}$  PN and even-order harmonic distortion components by the pseudo-differential ADC configuration.

The characterization of SNR/SNDR versus input signal amplitude ( $A_{in}$ ) and frequency ( $f_{in}$ ) is presented in Fig. 19 for 80- and 160-kHz BW configurations. An SFDR of 67.2 dB at  $f_{in}$  of 50 kHz (an input frequency that places the third harmonic near the edge of the stated 160-kHz signal BW) results from dynamic distortion and contributes to a ~2-dB drop in SNDR at high input frequencies.



Fig. 18. Measured (a) pseudo-differential and (b) single-ended ADC output spectrum of a 50-kHz 0.4- $V_{pp}$  differential input sinewave, sampled at 34.6 MS/s and decimated by 4 on-chip.

The effect of dynamic distortion is more appropriately investigated through two-tone tests at various input frequencies  $(A_{in} = -23 \text{ dBV})$  within the signal band (see Fig. 20). The worst case third-order intermodulation distortion (IM3) is -66.5 dBc at 150 kHz when  $f_1 = 130$  kHz and  $f_2 = 140$  kHz, while the second-order intermodulation distortion (IM2) is more pronounced when  $f_1 = 65$  kHz and  $f_2 = 70$  kHz, yielding -65 dBc at 135 kHz.
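The locations of the reported intermodulation products follow directly from the two tone frequencies. A minimal sketch, with an illustrative function name and frequencies in kHz:

```python
def intermod_products_khz(f1, f2):
    """Second- and third-order intermodulation frequencies of a two-tone input."""
    return {
        "IM2": sorted({abs(f2 - f1), f1 + f2}),
        "IM3": sorted({abs(2 * f1 - f2), abs(2 * f2 - f1)}),
    }

high_pair = intermod_products_khz(130, 140)
mid_pair = intermod_products_khz(65, 70)
```

For the 130/140-kHz pair the upper IM3 product $2f_2 - f_1$ falls at 150 kHz, and for the 65/70-kHz pair the IM2 sum product $f_1 + f_2$ falls at 135 kHz; both lie inside the 160-kHz signal band.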

The total power consumed (excluding the decimation-filtering back-end for a fair comparison), as shown in Fig. 21(a), is 15.9  $\mu$ W, of which the power budget is 22%, 12%, 12%, 29%, and 25% for the VCOs, VCO buffers, PD matrix, PFs, and FDCs, respectively. For the sake of completeness, the power breakdown versus supply voltage is further displayed in Fig. 21(b)-(d) for the VCO, APSP, and FDC blocks. In addition, the resistor-divider network consumes current through its input path. With  $R_1$  and  $R_2$  values of 5 and 2.5 k $\Omega$ , respectively, and at the common-mode input of 0.1 V, an average current of around 15  $\mu$ A flows through both  $V_{in+}$  and  $V_{in-}$ . This amounts to an extra power overhead of 3  $\mu$ W. The introduction of phase folding represents an FDC power saving of 45%, before considering additional digital back-end and clocking overhead, which would incur greater FDC parallelism. The beat-frequency extraction consumes less than 2  $\mu$ W. Only 22% of the power budget is allocated to the VCOs; therefore, performance gains may be achievable with power/PN tradeoff optimization. The resulting Walden and Schreier FoM is 73.3/61.5 fJ/c-s and 161.4/159.9 dB, respectively, over 80-/160-kHz BW configurations.
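The reported figures of merit can be reproduced from the stated power, bandwidth, and peak SNDR values (64.4/59.9 dB over 80/160 kHz, as quoted in the abstract). The sketch below uses the common definitions of the Walden and Schreier FoMs; the function names are illustrative.

```python
import math

def enob(sndr_db):
    """Effective number of bits from SNDR."""
    return (sndr_db - 1.76) / 6.02

def walden_fom_fj(power_w, sndr_db, bw_hz):
    """Walden FoM in fJ per conversion step, using a Nyquist rate of 2*BW."""
    return power_w / (2 ** enob(sndr_db) * 2 * bw_hz) * 1e15

def schreier_fom_db(power_w, sndr_db, bw_hz):
    """Schreier FoM in dB (SNDR-based)."""
    return sndr_db + 10 * math.log10(bw_hz / power_w)

P_CORE_W = 15.9e-6  # ADC core power, excluding the decimation back-end
fom_80k = (walden_fom_fj(P_CORE_W, 64.4, 80e3), schreier_fom_db(P_CORE_W, 64.4, 80e3))
fom_160k = (walden_fom_fj(P_CORE_W, 59.9, 160e3), schreier_fom_db(P_CORE_W, 59.9, 160e3))
```

With these inputs the results come out at approximately 73.3 fJ/c-s and 161.4 dB for the 80-kHz configuration and 61.5 fJ/c-s and 159.9 dB for the 160-kHz configuration.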



Fig. 19. Measured SNR and SNDR versus (a) input signal amplitude and (b) input frequency for 80- and 160-kHz BW configurations, respectively.



Fig. 20. Measured two-tone test (-23 dBV) with input frequencies: (a) 130 and 140 kHz, (b) 65 and 70 kHz, and (c) 11 and 13 kHz.

The 80-kHz BW SNR/SNDR characterization versus VCO and APSP supply voltages ( $V_{DD,VCO}$  swept from 170 to 225 mV, while  $V_{DD,APSP} = V_{DD,VCO} + 15$  mV) is shown in Fig. 22. The SNDR of 64 dB occurs at  $V_{DD,VCO}$  of 185 mV and remains above 62 dB from 182.5 mV to 192.5 mV ( $\Delta V_{DD}$  of 10 mV,  $f_{REF}$  ranging from 3.7 to 4.4 MHz) for the input resistor configuration  $R_1/R_2 = 5.0/2.5$  k $\Omega$ . From 192.5 to 205 mV ( $f_{\text{REF}}$ : 4.4–5.4 MHz) and 205 to 215 mV ( $f_{\text{REF}}$ : 5.4–6.4 MHz), the input resistor network should be set to  $R_1/R_2 = 3.6/1.1 \text{ k}\Omega$  and  $R_1/R_2 = 3.0/0.8 \text{ k}\Omega$ , respectively. Utilizing the sensed  $f_{\text{REF}}$  to actuate the resistor values accordingly extends >10-ENOB (80-kHz BW) performance by at least 30 mV ( $\approx$ 20% supply variation). Similarly, at 160-kHz BW, a lower bound SNDR of 58 dB is achieved across a supply voltage variation of 45 mV (180–225 mV) and can be further extended with wider resistor programmability.

Fig. 21. Measured power consumption of (a) ADC core and constituent, (b) VCO, (c) APSP, and (d) FDC blocks versus supply voltage. Peak performance is obtained for  $V_{\text{DD,VCO}}$ ,  $V_{\text{DD,APSP}}$ , and  $V_{\text{DD,FDC}}$  of 185, 200, and 210 mV, respectively.

Fig. 23 extends the SNDR characterization versus  $V_{DD,VCO}$  across four IC samples to demonstrate the effect of inter-die process variations. Taking 185 mV as the optimal performance for IC1 (64-dB SNDR and  $f_{REF}$  of 3.9 MHz), it is evident that IC2 exhibits nearly identical characteristics, as observed by the overlapping SNDR and  $f_{REF}$  curves. IC3 is the fastest among the reported samples, with  $f_{REF}$  of 4.8 MHz. Inspecting Fig. 22, this corresponds to an equivalent upward  $V_{DD,VCO}$  shift of 12.5 mV; thus, the input resistor configuration "B" should be used. Indeed, SNDR recovers to greater than 62 dB instead of 56 dB by exploiting the *transferred*  $f_{REF}$  information obtained with IC1. IC4 is slightly slower than IC3 (equivalent to a downward  $V_{DD,VCO}$  shift of 5 mV) and is at the transition point of input resistor configurations "A" and "B."
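The $f_{\text{REF}}$-driven resistor selection described above behaves like a small lookup table. The sketch below encodes the measured $f_{\text{REF}}$ regions and resistor pairs quoted for Fig. 22. It assumes that configurations "A" and "B" denote the 5.0/2.5-k$\Omega$ and 3.6/1.1-k$\Omega$ settings, respectively; the label "C" for the third setting, the treatment of region boundaries, and the function name are assumptions made for illustration.

```python
import bisect

# Lower edges of the f_REF regions (MHz), plus the upper limit of the last one.
_EDGES_MHZ = [3.7, 4.4, 5.4, 6.4]
# (label, R1 in kOhm, R2 in kOhm) for each region; label "C" is hypothetical.
_CONFIGS = [("A", 5.0, 2.5), ("B", 3.6, 1.1), ("C", 3.0, 0.8)]

def select_resistors(f_ref_mhz):
    """Return (label, R1_kohm, R2_kohm), or None outside the characterized range."""
    if not (_EDGES_MHZ[0] <= f_ref_mhz < _EDGES_MHZ[-1]):
        return None
    idx = bisect.bisect_right(_EDGES_MHZ, f_ref_mhz) - 1
    return _CONFIGS[idx]

ic1 = select_resistors(3.9)  # IC1 at 185 mV
ic3 = select_resistors(4.8)  # IC3, the fastest sample
```

With this table a sample measuring 3.9 MHz maps to configuration "A" and one measuring 4.8 MHz maps to "B", in line with the IC1 and IC3 observations; a sample sitting at 4.4 MHz lies on the A/B boundary, as noted for IC4.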

Table I compares the proposed 0.2-V open-loop VCO-based ADC, which is an architectural advancement of [16], with state-of-the-art low-voltage  $\Delta \Sigma$  ADCs and digitally intensive VCO-based ADC architectures. This work achieves the highest signal BW (SNDR  $\geq 60$  dB) among ADCs with supply  $\leq 0.4$  V by exploiting open-loop, multiphase parallelism extended across analog-digital domains. It demonstrates the power efficiency FoM rivaling strong-inversion VCO-based ADC architectures and near-threshold continuous-time  $\Delta \Sigma$  ADCs. Most importantly, it is the only work among deep-subthreshold designs to showcase

TABLE I PERFORMANCE SUMMARY AND COMPARISON WITH STATE-OF-THE-ART SUBTHRESHOLD AND VCO-BASED ADCS

|                           | This Work           |                   | [16]<br>SSCL18                               | [14]<br>ESSCIRC07       | [13]<br>ESSCIRC06 | [51]<br>JSSC12                      | [52]<br>JSSC19  | [53]<br>JSSC18   | [54]<br>JSSC15                      | [27]<br>JSSC13                       | [22]<br>JSSC15         | [20]<br>JSSC17         | [26]<br>ASSCC19      | [23]<br>JSSC17        | [28]<br>JSSC19              |
|---------------------------|---------------------|-------------------|----------------------------------------------|-------------------------|-------------------|-------------------------------------|-----------------|------------------|-------------------------------------|--------------------------------------|------------------------|------------------------|----------------------|-----------------------|-----------------------------|
| Architecture              | Open-loop<br>VCO    |                   | Open-loop<br>VCO                             | Open-loop<br>VCO        | Open-loop<br>VCO  | DT-ΔΣ                               | CT-ΔΣ           | CT-ΔΣ            | DT-ΔΣ                               | Background<br>digital<br>calibration | Multi-<br>stage<br>VCO | Closed-<br>loop<br>VCO | Second-<br>order VCO | Third-<br>order VCO   | Non-<br>uniform<br>sampling |
| PVT Tolerance             | Variation<br>aware* |                   | On-chip<br>calibration<br><u>required</u> ** | Not<br>reported         | Not<br>reported   | Boot-<br>strapped<br>switch-<br>cap | Not<br>reported | Gm<br>regulation | Boot-<br>strapped<br>switch-<br>cap |                                      |                        |                        |                      |                       |                             |
| Process (nm)              | 28                  |                   | 28                                           | 90                      | 90                | 130                                 | 130             | 90               | 130                                 | 65                                   | 40                     | 130                    | 65                   | 65                    | 65                          |
| Supply (V)                | <0.21               |                   | 0.2                                          | 0.2                     | 0.2               | 0.25                                | 0.3             | 0.4              | 0.4                                 | 1.0                                  | 0.9                    | 1.2                    | 1.0                  | 1.2                   | 1.05                        |
| Area (mm <sup>2</sup> )   | 0.12                |                   | 0.12                                         | 0.016                   | 0.02              | 0.34                                | 0.74            | 0.144            | 0.33                                | 0.075                                | 0.017                  | 0.13                   | 0.06                 | 0.01                  | 0.13                        |
| Power (µW)                | 15.9                |                   | 7                                            | 7.5                     | 0.44              | 7.5                                 | 26.3            | 26.4             | 63                                  | 17,500                               | 2,570                  | 1,050                  | 100                  | 3,700                 | 19,700                      |
| F <sub>CLK</sub> (MHz)    | 34.6                |                   | 30                                           | 12                      | 3.4               | 1.4                                 | 6.4             | 10.4             | 3.2                                 | 1,600                                | 1,600                  | 250                    | 32.6                 | 1,600                 | 4,000                       |
| BW (kHz)                  | 80                  | 160               | 60                                           | 20                      | 20                | 10                                  | 50              | 50               | 20                                  | 12,500                               | 40,000                 | 3,000                  | 1,500                | 10,000                | 200,000                     |
| SNR <sub>peak</sub> (dB)  | 66.17               | 61.7              | 70.02                                        | 68.9                    | 47.4              | -                                   | 68.7            | 74.4             | 77.7                                | 75                                   | 60.7                   | -                      | 73.1                 | 66.2                  | 60.1                        |
| SNDR <sub>peak</sub> (dB) | 64.4                | 59.9              | 67.42                                        | 60.3                    | 44.2              | 61                                  | 68.5            | 74.4             | 76.1                                | 74                                   | 59.5                   | 70.2                   | 72.7                 | 65.7                  | 58.5                        |
| SFDR <sub>peak</sub> (dB) | 71.2                | 67.2              | 71.19                                        | -                       | -                 | 70                                  | 82.6            | 85.2             | -                                   | 83                                   | -                      | 84                     | 81                   | 75.5                  | -                           |
| ENOB (bits)               | 10.4                | 9.66              | 10.91                                        | 9.7                     | 7.0               | 9.8                                 | 11              | 12.1             | 12.35                               | 12                                   | 9.59                   | 11.37                  | 11.78                | 10.62                 | 9.43                        |
| FoM <sub>W</sub> (fJ/c-s) | 73.3                | 61.5              | 29.9                                         | 221.7                   | 83                | 410                                 | 121             | 61.5             | 310                                 | 171                                  | 42                     | 66.2                   | 9.9                  | 117.5                 | 71.6                        |
| FoM <sub>S</sub> (dB)     | 161.4               | 159.9             | 166.8                                        | 154.6                   | 150.8             | 152.2                               | 161.3           | 167.2            | 161.1                               | 163                                  | 161.4                  | 164.8                  | 174.5                | 160                   | 158.6                       |

\* <2 dB degradation in SNDR with 10% supply variation.

\*\* >18 dB degradation in SNDR with 10% supply variation.

FoM<sub>W</sub> = Power/(2<sup>ENOB</sup> · 2BW), FoM<sub>S</sub> = SNDR + 10 log<sub>10</sub>(BW/Power).





Fig. 22. Measured SNR/SNDR at (a) 80- and (b) 160-kHz BWs versus VCO supply voltage ($V_{\text{DD,VCO}}$) for different input resistor network configurations with (c) corresponding  $f_{\text{REF}}$  state.  $V_{\text{DD,APSP}}$ is set to be  $V_{\text{DD,VCO}}$ +15 mV, while FDC clock sampling is fixed at 35 MS/s.

Fig. 23. Measured SNDR at (a) 80- and (b) 160-kHz BWs versus  $V_{\text{DD,VCO}}$ with (c) corresponding  $f_{\text{REF}}$  state for four IC samples.

moderate-to-high performance (in the region of 9–11 bits over 50–200 kHz BW), taking into account inter-die process and supply variations (<2-dB SNDR degradation within 10%  $V_{DD}$ variation, instead of the 18-dB degradation in [16]), achieved through practical embedded feedforward and feedback PVT-aware capabilities.
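The figures of merit listed for this work in Table I can be cross-checked from the measured power, SNDR, and BW. The sketch below is an illustration only: it applies the FoM definitions given under Table I and assumes the usual sine-wave relation ENOB = (SNDR − 1.76)/6.02, which the table does not state explicitly. The function names are hypothetical and chosen for this example.

```python
import math


def enob_from_sndr(sndr_db: float) -> float:
    """Assumed standard relation: ENOB = (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02


def fom_walden_fj(power_w: float, sndr_db: float, bw_hz: float) -> float:
    """FoM_W = Power / (2^ENOB * 2*BW), returned in fJ per conversion step."""
    steps_per_second = (2.0 ** enob_from_sndr(sndr_db)) * 2.0 * bw_hz
    return power_w / steps_per_second * 1e15


def fom_schreier_db(power_w: float, sndr_db: float, bw_hz: float) -> float:
    """FoM_S = SNDR + 10*log10(BW / Power), in dB."""
    return sndr_db + 10.0 * math.log10(bw_hz / power_w)


if __name__ == "__main__":
    power_w = 15.9e-6  # ADC core power taken from Table I
    # (bandwidth in Hz, peak SNDR in dB) for the two BW configurations
    for bw_hz, sndr_db in ((80e3, 64.4), (160e3, 59.9)):
        print(
            f"BW = {bw_hz / 1e3:.0f} kHz: "
            f"ENOB = {enob_from_sndr(sndr_db):.2f} bits, "
            f"FoM_W = {fom_walden_fj(power_w, sndr_db, bw_hz):.1f} fJ/c-s, "
            f"FoM_S = {fom_schreier_db(power_w, sndr_db, bw_hz):.1f} dB"
        )
```

With the Table I inputs, the computed values agree with the tabulated 73.3/61.5 fJ/c-s and 161.4/159.9 dB to within rounding.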

# VII. CONCLUSION

The massive deployment of IoT networks has been supported by energy harvesters that often supply voltages well below the threshold level of MOS transistors. This article, through the demonstration of a 0.2-V open-loop VCO-based ADC, overcomes the limitations of deep-subthreshold operation to realize the potential of digitally intensive mixed-signal circuits for medium-to-high data conversion performance. The constituent beat-frequency VCO network, with its embedded replica VCO, delivers variation-aware analog linearization for low-noise and highly linear V-to-f conversion. APSP circuits for beat-frequency extraction decouple the VCO gain requirements from its rest oscillation frequency and suppress large PVT-induced VCO frequency shifts in a feedforward manner, while phase interpolation and phase folding significantly reduce the digital hardware overhead. High-speed FDCs and multi-phase operation distributed across both analog-digital domains improve the ADC BW by more than  $2\times$. A peak SNDR of 64.4/59.9 dB over 80-/160-kHz BW is achieved while consuming 15.9  $\mu$W. The corresponding FoMs rival those of state-of-the-art digitally intensive VCO-based ADCs. The robustness is further verified through measurements across multiple ICs and supply voltages.

## ACKNOWLEDGMENT

The authors would like to thank the TSMC University Shuttle for the chip fabrication; Dr. Hsieh-Hung Hsieh (TSMC) and Dr. Teerachot Siriburanon for help with the tapeout; the Microelectronic Circuits Centre Ireland (MCCI) for technical and administrative support; Suoping Hu and Hieu Minh Nguyen for technical discussions; and Erik Staszewski and Sunisa Staszewski for measurement and lab assistance. They would also like to acknowledge the anonymous reviewers for their invaluable comments and suggestions.

## REFERENCES

- [1] S. Bandyopadhyay and A. P. Chandrakasan, "Platform architecture for solar, thermal, and vibration energy combining with MPPT and single inductor," *IEEE J. Solid-State Circuits*, vol. 47, no. 9, pp. 2199–2215, Sep. 2012.
- [2] Y. Jiang, M.-K. Law, Z. Chen, P.-I. Mak, and R. P. Martins, "Algebraic series-parallel-based switched-capacitor DC–DC boost converter with wide input voltage range and enhanced power density," *IEEE J. Solid-State Circuits*, vol. 54, no. 11, pp. 3118–3134, Nov. 2019.
- [3] Y.-T. Lin, N. Pourmousavian, C.-C. Li, M.-S. Yuan, C.-H. Chang, and R. B. Staszewski, "A 180 mV 81.2%-efficient switched-capacitor voltage doubler for IoT using self-biasing deep N-well in 16-nm CMOS FinFET," *IEEE Solid-State Circuits Lett.*, vol. 1, no. 7, pp. 158–161, Jul. 2018.
- [4] A. Esmailiyan, F. Schembari, and R. B. Staszewski, "A 0.36-V 5-MS/s time-mode flash ADC with Dickson-charge-pump-based comparators in 28-nm CMOS," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 67, no. 6, pp. 1789–1802, Jun. 2020.
- [5] S. Yang, J. Yin, H. Yi, W.-H. Yu, P.-I. Mak, and R. P. Martins, "A 0.2-V energy-harvesting BLE transmitter with a micropower manager achieving 25% system efficiency at 0-dBm output and 5.2-nW sleep power in 28-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 54, no. 5, pp. 1351–1362, May 2019.
- [6] H. Yi, W.-H. Yu, P.-I. Mak, J. Yin, and R. P. Martins, "A 0.18-V 382-μW Bluetooth low-energy receiver front-end with 1.33-nW sleep power for energy-harvesting applications in 28-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 53, no. 6, pp. 1618–1627, Jun. 2018.
- [7] S. Hu et al., "A type-II phase-tracking receiver," IEEE J. Solid-State Circuits, vol. 56, no. 2, pp. 427–439, Feb. 2021.

- [8] C. Li *et al.*, "All-digital PLL for Bluetooth low energy using 32.768-kHz reference clock and ≤0.45-V supply," *IEEE J. Solid-State Circuits*, vol. 53, no. 12, pp. 3660–3671, Dec. 2018.
- [9] C.-C. Li, M.-S. Yuan, Y.-T. Lin, C.-C. Liao, C.-H. Chang, and R. B. Staszewski, "A 0.2-V three-winding transformer-based DCO in 16-nm FinFET CMOS," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 67, no. 12, pp. 2878–2882, Dec. 2020.
- [10] Z. Zhang, G. Zhu, and C. P. Yue, "A 0.25–0.4-V, sub-0.11-mW/GHz, 0.15–1.6-GHz PLL using an offset dual-path architecture with dynamic charge pumps," *IEEE J. Solid-State Circuits*, vol. 56, no. 6, pp. 1871–1885, Jun. 2021.
- [11] A. Wang and A. Chandrakasan, "A 180-mV subthreshold FFT processor using a minimum energy design methodology," *IEEE J. Solid-State Circuits*, vol. 40, no. 1, pp. 310–319, Jan. 2005.
- [12] S. Hanson *et al.*, "Exploring variability and performance in a sub-200-mV processor," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 881–891, Apr. 2008.
- [13] U. Wismar, D. Wisland, and P. Andreani, "A 0.2 V 0.44 μW 20 kHz analog to digital ΣΔ modulator with 57 fJ/conversion FoM," in *Proc.* 32nd Eur. Solid-State Circuits Conf., Montreux, Switzerland, Sep. 2006, pp. 187–190.
- [14] U. Wismar, D. Wisland, and P. Andreani, "A 0.2 V, 7.5 μW, 20 kHz ΣΔ modulator with 69 dB SNR in 90 nm CMOS," in *Proc. IEEE Eur. Solid-State Circuits Conf.*, Munich, Germany, Sep. 2007, pp. 206–209.
- [15] M. Alioto, "Ultra-low power VLSI circuit design demystified and explained: A tutorial," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 59, no. 1, pp. 3–29, Jan. 2012.
- [16] V. Nguyen, F. Schembari, and R. B. Staszewski, "A 0.2-V 30-MS/s 11b-ENOB open-loop VCO-based ADC in 28-nm CMOS," *IEEE Solid-State Circuits Lett.*, vol. 1, no. 9, pp. 190–193, Sep. 2018.
- [17] A. Babaie-Fishani and P. Rombouts, "Highly linear VCO for use in VCO-ADCs," *Electron. Lett.*, vol. 52, no. 4, pp. 268–270, 2016.
- [18] M. Z. Straayer and M. H. Perrott, "A 12-bit, 10-MHz bandwidth, continuous-time ΣΔ ADC with a 5-bit, 950-MS/s VCO-based quantizer," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 805–814, Apr. 2008.
- [19] H. Wang, V. Nguyen, F. Schembari, and R. B. Staszewski, "An adaptive-resolution quasi-level-crossing delta modulator with VCO-based residue quantizer," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 67, no. 12, pp. 2828–2832, Dec. 2020.
- [20] S. Li, A. Mukherjee, and N. Sun, "A 174.3-dB FoM VCO-based CT ΔΣ modulator with a fully-digital phase extended quantizer and tri-level resistor DAC in 130-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 52, no. 7, pp. 1940–1952, Jul. 2017.
- [21] K. Ragab and N. Sun, "A 12-b ENOB 2.5-MHz BW VCO-based 0–1 MASH ADC with direct digital background calibration," *IEEE J. Solid-State Circuits*, vol. 52, no. 2, pp. 433–447, Feb. 2017.
- [22] X. Xing and G. G. E. Gielen, "A 42 fJ/step-FoM two-step VCO-based delta-sigma ADC in 40 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 50, no. 3, pp. 714–723, Mar. 2015.
- [23] A. Babaie-Fishani and P. Rombouts, "A mostly digital VCO-based CT-SDM with third-order noise shaping," *IEEE J. Solid-State Circuits*, vol. 52, no. 8, pp. 2141–2153, Aug. 2017.
- [24] F. Cardes et al., "0.04-mm<sup>2</sup> 103-dB-A dynamic range second-order VCO-based audio ΣΔ ADC in 0.13-μm CMOS," IEEE J. Solid-State Circuits, vol. 53, no. 6, pp. 1731–1742, Jun. 2018.
- [25] A. Jayaraj *et al.*, "Highly digital second-order ΔΣ VCO ADC," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 66, no. 7, pp. 2415–2425, Jul. 2019.
- [26] A. Jayaraj et al., "8.6 fJ/step VCO-based CT 2nd-order ΔΣ ADC," in Proc. IEEE Asian Solid-State Circuits Conf., Nov. 2019, pp. 197–200.
- [27] G. Taylor and I. Galton, "A reconfigurable mostly-digital delta-sigma ADC with a worst-case FOM of 160 dB," *IEEE J. Solid-State Circuits*, vol. 48, no. 4, pp. 983–995, Apr. 2013.
- [28] T.-F. Wu and M. S.-W. Chen, "A noise-shaped VCO-based nonuniform sampling ADC with phase-domain level crossing," *IEEE J. Solid-State Circuits*, vol. 54, no. 3, pp. 623–635, Mar. 2019.
- [29] W. Jiang, V. Hokhikyan, H. Chandrakumar, V. Karkare, and D. Markovic, "A ±50-mV linear-input-range VCO-based neural-recording front-end with digital nonlinearity correction," *IEEE J. Solid-State Circuits*, vol. 52, no. 1, pp. 173–184, Jan. 2017.
- [30] J. Daniels *et al.*, "A 0.02 mm<sup>2</sup> 65 nm CMOS 30 MHz BW all-digital differential VCO-based ADC with 64 dB SNDR," in *Proc. Symp. VLSI Circuits*, Jun. 2010, pp. 155–156.
- [31] M. Baert and W. Dehaene, "A 5-GS/s 7.2-ENOB time-interleaved VCO-based ADC achieving 30.5 fJ/cs," *IEEE J. Solid-State Circuits*, vol. 55, no. 6, pp. 1577–1587, Jun. 2020.

- [32] J. Kim, T. K. Jang, Y. G. Yoon, and S. Cho, "Analysis and design of voltage-controlled oscillator based analog-to-digital converter," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 57, no. 1, pp. 18–30, Jan. 2010.
- [33] J. Borgmans, R. Riem, and P. Rombouts, "The analog behavior of pseudo digital ring oscillators used in VCO ADCs," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 68, no. 7, pp. 2827–2840, Jul. 2021.
- [34] A. Quintero et al., "A coarse-fine VCO-ADC for MEMS microphones with sampling synchronization by data scrambling," *IEEE Solid-State Circuits Lett.*, vol. 3, pp. 29–32, 2020.
- [35] C. Perez, A. Quintero, P. Amaral, A. Wiesbauer, and L. Hernandez, "A 73 dB-A audio VCO-ADC based on a maximum length sequence generator in 130 nm CMOS," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, early access, Apr. 13, 2021, doi: 10.1109/TCSII.2021.3073085.
- [36] T.-H. Kim, R. Persaud, and C. H. Kim, "Silicon odometer: An on-chip reliability monitor for measuring frequency degradation of digital circuits," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 874–880, Apr. 2008.
- [37] S. Kundu, M. Liu, S.-J. Wen, R. Wong, and C. H. Kim, "A fully integrated digital LDO with built-in adaptive sampling and active voltage positioning using a beat-frequency quantizer," *IEEE J. Solid-State Circuits*, vol. 54, no. 1, pp. 109–120, Jan. 2019.
- [38] T. Anand, K. A. A. Makinwa, and P. K. Hanumolu, "A VCO based highly digital temperature sensor with 0.034 °C/mV supply sensitivity," *IEEE J. Solid-State Circuits*, vol. 51, no. 11, pp. 2651–2663, Nov. 2016.
- [39] F. U. Rahman *et al.*, "A unified clock and switched-capacitor-based power delivery architecture for variation tolerance in low-voltage SoC domains," *IEEE J. Solid-State Circuits*, vol. 54, no. 4, pp. 1173–1184, Apr. 2019.
- [40] N. Pourmousavian, F. Kuo, T. Siriburanon, M. Babaie, and R. B. Staszewski, "A 0.5-V 1.6-mW 2.4-GHz fractional-N all-digital PLL for Bluetooth LE with PVT-insensitive TDC using switched-capacitor doubler in 28-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 53, no. 9, pp. 2572–2583, Sep. 2018.
- [41] H. Pan and A. A. Abidi, "Signal folding in A/D converters," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 51, no. 1, pp. 3–14, Jan. 2004.
- [42] J. Yin *et al.*, "A time-interleaved ring-VCO with reduced 1/f<sup>3</sup> phase noise corner, extended tuning range and inherent divided output," *IEEE J. Solid-State Circuits*, vol. 51, no. 12, pp. 2979–2991, Dec. 2016.
- [43] K. A. Bowman *et al.*, "A 45 nm resilient microprocessor core for dynamic variation tolerance," *IEEE J. Solid-State Circuits*, vol. 46, no. 1, pp. 194–208, Jan. 2011.
- [44] J.-M. Chou, Y.-T. Hsieh, and J.-T. Wu, "Phase averaging and interpolation using resistor strings or resistor rings for multi-phase clock generation," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 53, no. 5, pp. 984–991, May 2006.
- [45] I. Madadi *et al.*, "A high IIP2 SAW-less superheterodyne receiver with multi-stage harmonic rejection," *IEEE J. Solid-State Circuits*, vol. 51, no. 2, pp. 332–347, Feb. 2016.
- [46] E. Gutierrez, L. Hernandez, F. Cardes, and P. Rombouts, "A pulse frequency modulation interpretation of VCOs enabling VCO-ADC architectures with extended noise shaping," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 65, no. 2, pp. 444–457, Feb. 2018.
- [47] B. Drost, M. Talegaonkar, and P. K. Hanumolu, "Analog filter design using ring oscillator integrators," *IEEE J. Solid-State Circuits*, vol. 47, no. 12, pp. 3120–3129, Dec. 2012.
- [48] S. Henzler, S. Koeppe, D. Lorenz, W. Kamp, R. Kuenemund, and D. Schmitt-Landsiedel, "A local passive time interpolation concept for variation-tolerant high-resolution time-to-digital conversion," *IEEE J. Solid-State Circuits*, vol. 43, no. 7, pp. 1666–1676, Jul. 2008.
- [49] R. Tseng, H. Li, D. H. Kwon, Y. Chiu, and A. S. Y. Poon, "A four-channel beamforming down-converter in 90-nm CMOS utilizing phase-oversampling," *IEEE J. Solid-State Circuits*, vol. 45, no. 11, pp. 2262–2272, Nov. 2010.
- [50] A. Khashaba, A. Elkholy, K. M. Megawer, M. G. Ahmed, and P. K. Hanumolu, "A low-noise frequency synthesizer using multiphase generation and combining techniques," *IEEE J. Solid-State Circuits*, vol. 55, no. 3, pp. 592–601, Mar. 2020.
- [51] F. Michel and M. S. J. Steyaert, "A 250 mV 7.5 μW 61 dB SNDR SC ΔΣ modulator using near-threshold-voltage-biased inverter amplifiers in 130 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 47, no. 3, pp. 709–721, Mar. 2012.
- [52] L. Lv *et al.*, "Inverter-based subthreshold amplifier techniques and their application in 0.3-V ΔΣ-modulators," *IEEE J. Solid-State Circuits*, vol. 54, no. 5, pp. 1436–1445, May 2019.
- [53] L. Lv *et al.*, "A 0.4-V G<sub>m</sub>-C proportional-integrator-based continuous-time ΔΣ modulator with 50-kHz BW and 74.4-dB SNDR," *IEEE J. Solid-State Circuits*, vol. 53, no. 11, pp. 3256–3267, Nov. 2018.
- [54] Y. Yoon, D. Choi, and J. Roh, "A 0.4-V 63 μW 76.1 dB SNDR 20 kHz bandwidth delta-sigma modulator using a hybrid switching integrator," *IEEE J. Solid-State Circuits*, vol. 50, no. 10, pp. 2342–2352, Oct. 2015.



**Viet Nguyen** (Student Member, IEEE) was born in Hanoi, Vietnam, in 1994. He received the B.Sc. and M.E. degrees in electronic engineering from University College Dublin (UCD), Dublin, Ireland, in 2016 and 2017, respectively, where he is currently pursuing the Ph.D. degree in microelectronics.

In 2016, he was an IC Design Intern with Xilinx, Dublin, for nine months. His current research interests include ultra-low-voltage circuit design and time-mode data conversion.



**Filippo Schembari** (Member, IEEE) was born in Codogno, Italy, in 1988. He received the B.Sc. degree in biomedical engineering and the M.Sc. and Ph.D. degrees in electrical engineering from the Politecnico di Milano, Milan, Italy, in 2010, 2012, and 2016, respectively.

During his M.Sc. and Ph.D. degrees, he worked on low-noise multi-channel readout ASICs for X- and  $\gamma$-ray spectroscopy and imaging applications. From 2016 to 2019, he was a Post-Doctoral Researcher with the University College Dublin (UCD), Dublin, Ireland, focusing on deep-subthreshold time-mode analog-to-digital converters (ADCs), level-crossing-sampling ADCs, mismatch-calibrated successive approximation register (SAR) ADCs, and SAR time-to-digital converters (TDCs). During that period, he also worked for six months as an IC Design Intern at Xilinx, Dublin. In 2019, he joined Huawei Technologies, Milan, as an RFIC Designer.

Dr. Schembari was a recipient of the 2018 IEEE Emilio Gatti and Franco Manfredi Best Ph.D. Thesis Award in Radiation Instrumentation and the Marie Sklodowska-Curie European Individual Fellowship (EU-IF) in 2017.



**Robert Bogdan Staszewski** (Fellow, IEEE) was born in Bialystok, Poland. He received the B.Sc. (*summa cum laude*), M.Sc., and Ph.D. degrees in electrical engineering from The University of Texas at Dallas, Richardson, TX, USA, in 1991, 1992, and 2002, respectively.

From 1991 to 1995, he was with Alcatel Network Systems, Richardson, involved in SONET crossconnect systems for fiber optics communications. He joined Texas Instruments Inc., Dallas, TX, USA, in 1995, where he was elected as a Distinguished Member of Technical Staff (limited to 2% of technical staff). From 1995 to 1999, he was engaged in advanced CMOS read channel development for hard disk drives. In 1999, he co-started the Digital RF Processor (DRP) Group within Texas Instruments Inc. with a mission to invent new digitally intensive approaches to traditional RF functions for integrated radios in deeply scaled CMOS technology. He was appointed as the CTO of the DRP Group from 2007 to 2009. In 2009, he joined the Delft University of Technology, Delft, The Netherlands, where he currently holds a guest appointment of Full Professor (Antoni van Leeuwenhoek Hoogleraar). Since 2014, he has been a Full Professor with the University College Dublin (UCD), Dublin, Ireland. He has authored or coauthored five books, seven book chapters, and 150 journal and 210 conference publications and holds 200 issued U.S. patents. His research interests include nanoscale CMOS architectures and circuits for frequency synthesizers, transmitters, and receivers, as well as quantum computers.

Dr. Staszewski was a recipient of the 2012 IEEE Circuits and Systems Industrial Pioneer Award.