## UC Santa Barbara

**UC Santa Barbara Previously Published Works** 

### Title

Design and Analysis of Collective Pulse Oscillators

### Permalink

https://escholarship.org/uc/item/0dj52940

### Journal

IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 28(5)

**ISSN** 1063-8210

### **Authors**

Mukim, Prashansa; Dalakoti, Aditya; McCarthy, David; et al.

### **Publication Date**

2020-05-01

### DOI

10.1109/tvlsi.2019.2959532

Peer reviewed

# Design and Analysis of Collective Pulse Oscillators

Prashansa Mukim, *Student Member, IEEE*, Aditya Dalakoti, David McCarthy, *Student Member, IEEE*, Carrie Segal, Merritt Miller, James F. Buckwalter, *Senior Member, IEEE*, and Forrest Brewer, *Member, IEEE*

*Abstract*: Collective pulse oscillators (CPOs) are novel designs constructed using pulse regenerative amplifiers that exhibit time-variant gate delay based on the residual charge from past state. This property makes it possible to achieve precise phase resolutions smaller than a pulse gate delay and/or provide identical phase taps at multiple physical locations. CPOs exhibit temporal phase error correction that results in an improvement in frequency stability $\propto -10 \log p$ for power $\propto p$ across all timescales beyond the correction settling time. While CPOs result in device noise-based figures of merit (FoMs) comparable to those of ring oscillators, they are more resilient to power-coupled and impulse noise. This article presents a systematic time-domain analysis of the properties of CPOs based on an abstract model that captures the time-variant delay of pulse gates. Closed-form analytic solutions for CPOs disturbed by impulse noise are derived, and higher order CPOs with continuous noise injection are analyzed using behavioral simulations and characterized using Allan deviation. Hspice simulation results are presented to validate the model and compare CPOs with ring oscillators. Allan deviation and phase noise measurements on CPOs of 8 and 40 gates fabricated in GFUS8RF (130-nm) technology corroborate the theory and simulation results.

*Index Terms*—Allan deviation, collective dynamics, multiphase oscillators, nonlinear amplifiers, pulse logic, voltage-controlled oscillators (VCOs).

#### I. INTRODUCTION

VOLTAGE-controlled oscillators (VCOs) are required in a broad range of digital, mixed-signal, and RF integrated circuit (IC) designs for logic timing, sampling, and frequency synthesis. Applications such as time-to-digital converters (TDCs) [1], [2], analog-to-digital converters (ADCs) [3], [4], clock and data recovery (CDR) [5], and microprocessors [6] require low integrated timing noise and often utilize multiple oscillator phases. To this end, ring oscillators, consisting of inverters or differential limiting amplifiers, are commonly used as VCOs due to their small footprint, broad tuning range, and availability of multiple clock phases.

Widely used techniques to improve phase noise (P.N.) of ring oscillators include transistor sizing [7], [8], jitter minimization [9], transmission line stabilization [10], and spatial coupling [11], [12]. P.N. is improved by increasing the power or by coupling to a high-Q resonator. Increasing the power yields a P.N. improvement of $-10\log p$, where the oscillator power is $\propto p$. On the other hand, linear spatial coupling between $p$ identical oscillators yields a P.N. improvement of $-10\log p$ only near the carrier frequency [12]. At frequency offsets far from the carrier, or equivalently over small timescales, the improvement in frequency stability is less than $\propto 1/\sqrt{p}$. This is due to the finite time associated with correction of noise perturbations in a weakly coupled system [13]. However, the possibility of achieving multiple clock phases with resolutions independent of the smallest gate delay for a given technology [11] and the ease of low-skew, low-jitter timing distribution [14] make spatially coupled oscillators viable for a variety of applications [15]–[18].

Manuscript received May 30, 2019; revised August 19, 2019, October 27, 2019, and November 23, 2019; accepted December 8, 2019. (*Prashansa Mukim and Aditya Dalakoti contributed equally to this work.*) (*Corresponding author: Prashansa Mukim.*)

The authors are with the Department of Electrical and Computer Engineering, University of California at Santa Barbara, Santa Barbara, CA 93106 USA (e-mail: prashansa@ece.ucsb.edu; aditya@ece.ucsb.edu; davidmc@ece.ucsb.edu; chsegal@ece.ucsb.edu; merrittmiller@ece.ucsb.edu; buckwalter@ece.ucsb.edu; forrest@ece.ucsb.edu).

Color versions of one or more of the figures in this article are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TVLSI.2019.2959532

This article presents collective pulse oscillators (CPOs) that are realized using pulse regenerative amplifiers [19] as the main delay element. With pulses traversing the loop as opposed to edges, CPOs can be constructed with either even or odd number of pulse buffers, thus providing even or odd number of phases. Further, they can be operated in multiple modes by injecting different numbers of pulses at start-up, providing precise phases with resolution smaller than the buffer delay if the number of pulses does not divide the number of phase taps. In particular, operation at high frequencies is supported independent of the number of available phase outputs. It is also possible to design CPOs that provide identically timed image clock phases at multiple physical locations.

For widely analyzed ring or *LC* oscillators, it is known that phase shifts due to noise persist indefinitely [20]. However, the behavior of CPOs is distinct in this regard. CPOs exhibit temporal degradation of phase error to a magnitude smaller than the initial injected phase error. This is achieved by partial retention of past state in the form of residual gate charge. Effectively, each gate of the CPO shows local negative timing feedback and corrects phase errors, leading to improvement of the global frequency stability. CPOs exhibit self-correction of phase error at timescales that are close to the oscillation period. As a result, their behavior is similar to that of spatially coupled oscillators with frequency stability improvement $\propto 1/\sqrt{p}$ across all timescales beyond the correction settling time. Multipulse CPOs also provide a mechanism to improve frequency stability and P.N. without increasing the total power density. Compared to ring oscillators operating at similar frequencies, CPOs indeed consume more power in order to provide phase correction. Thus, for a figure of merit (FoM) that only takes into account device noise, along with frequency and power, the performance of CPOs is similar to that of ring oscillators. However, for spatially correlated or impulsive noise sources (e.g., power-coupled noise and single-event upsets), CPOs achieve superior power versus noise tradeoffs. This article presents a systematic analysis of the properties of CPOs and their dependence on design parameters.

1063-8210 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.

Fig. 1. (a) CPOs formed by ring of pulse gates. (b) Voltage at pulse gate nodes during pulse generation.

Anomalously stable timing characteristics of pulse-gate-based circuits were observed in serial links and arbiters [19]. Fairbanks and Moore [21] observed that asynchronous tokens fired into a ring had dynamics that led to uniform distribution of the tokens around the ring. Winters and Greenstreet [22] used pulse asynchronous circuits to create precision pipelines. The existence of predictable delay-separation local dynamics in pulse gates, methods to modify the dynamics via gate construction or device tuning, and results on stability of multipulse oscillators have also been presented [23]. In a similar vein, self-timed ring oscillators (STROs) [24], constructed using Muller-C gates, have been proposed, which utilize gate dynamics to yield integrated P.N. improvements. To the best of our knowledge, this is the first work that presents detailed time-domain analysis of the behavior of oscillators that exploit local time-variant gate delays to achieve precise phase resolutions and also improve the global frequency stability. The analysis is strongly supported by simulation and measurement results.

This article is organized as follows. Description of pulse gates, oscillator architecture, and operation are presented in Section II. Time-domain closed-form analytic solutions for phase error of simple loops when disturbed by impulse noise are derived in Section III. In Section IV, Allan deviation as the time-domain frequency stability metric for oscillators is overviewed. In Section V, the analytic model is expanded into a generic behavioral tool, enabling the analysis of more complex loops with arbitrarily imposed noise. The behavioral model is validated against Hspice simulations and the effect of power-law noise on different parameter CPOs is analyzed. A comparison between CPOs and ring oscillators in terms of their response to impulse noise, uncorrelated white device noise, and power-coupled noise is drawn in Section VI. Finally, measurements on CPOs of 8 and 40 gates fabricated in GFUS8RF (130 nm) process are presented in Section VII.

Fig. 2. (a) Different delay-separation dynamics in amplifiers. (b) Delay-separation curve for a typical pulse gate showing different regions of operation.

#### **II. SYSTEM DESCRIPTION**

A pulse gate is a nonlinear, shape-preserving amplifier shown in Fig. 1(a), derived from a self-resetting pulse generator. The critical node voltage,  $V_{crit}$ , is pulled down by the input pulse signal at  $V_{in}$ , which pulls up the output,  $V_{out}$ , and triggers the pull down of  $V_{reset}$ , causing the pMOS to pull up  $V_{crit}$ , resetting the gate. The keeper loop restores charge on the  $V_{crit}$  node and prevents it from floating when it is not actively driven by an input pulse or the resetting pMOS transistor. After a suitable delay,  $V_{reset}$  returns to its steady state. Fig. 1(b) shows the voltages at different nodes of the pulse buffer during pulse generation. Here,  $t_{Dp}$  is the delay of the forward path from the input to the output pulse and  $t_{Dn}$  is the delay of the reset loop. A CPO is formed by connecting pulse gates in a loop.

The relative values of $t_{Dp}$ and $t_{Dn}$ lead to interesting dynamics. These dynamics are represented through a delay-separation curve that shows how the input-output gate delay ($t_{Dp}$) is modulated by the time separation ($\tau$) between consecutive input pulses at that gate. Fig. 2(a) shows delay-separation curves for repulsive, constant, or attractive dynamics in amplifiers. Here, *m* is the slope of the curve and *b* is the y-intercept. While this slope is negative in pulse gates, typical inverters have a very slightly positive slope, that is, the delay of the gate decreases as pulses approach in time. This leads to the well-known "settling" of ring oscillators to the lowest possible mode. Fig. 2(b) shows the delay-separation curve for a typical pulse gate. This curve comprises three regions, which are as follows.

- 1) Region-1: If the separation $\tau[i+1]$ between consecutive pulses p[i] and p[i+1] is small such that $V_{\text{crit}}$ is still pulled low when pulse p[i+1] occurs, the pulse $p_o[i+1]$ at $V_{\text{out}}$ will coalesce with the pulse $p_o[i]$. In effect, the input pulse p[i+1] is rejected.
- 2) Region-2: If the separation $\tau[i+1]$ between consecutive pulses p[i] and p[i+1] is such that $V_{\text{crit}}$ has been pulled up substantially, but the gate is in its reset phase when pulse p[i+1] occurs, that is, $V_{\text{reset}}$ is active (low), trying to pull up $V_{\text{crit}}$, a distinct pulse $p_o[i+1]$ at $V_{\text{out}}$ will be generated. However, with both the pull-down and pull-up transistors active, the time to discharge $V_{\text{crit}}$ increases, increasing the delay of the gate, effectively causing repulsion of $p_o[i+1]$. This behavior leads to a high-slope delay-separation region.
- 3) Region-3: If the separation $\tau[i+1]$ between consecutive pulses p[i] and p[i+1] is such that the reset phase has concluded when p[i+1] occurs, the stored charge in the keeper hysteresis loop slightly delays the pulse $p_o[i+1]$ at $V_{\text{out}}$. This leads to a low-slope delay-separation region.

By changing the relative delays of a pulse gate's forward and feedback paths ( $t_{Dp}$  and  $t_{Dn}$ , respectively), both *b* and *m* can be tuned. Methods for this are described in [23]. When the pulse gates are connected in a loop as is the case with CPOs, pulses distribute uniformly in time around the loop due to the repulsive dynamics of the gates. Hence, a precise phase tap is available at each CPO gate. A CPO ring can be constructed with either even or odd number of pulse gates, making it possible to generate both even or odd number of clock phases. A *g*-gate CPO operating with *p*-pulses where g/p is an integer will exhibit g/p different phases, with each phase available at unique phase with period  $(g/p) * t_{Dp}$  and phase resolution  $t_{Dp}/p$ . Thus, a CPO can easily produce pulses with phase resolution finer than  $t_{Dp}$ .

For a CPO that has *p* pulses traveling around the loop, the oscillation period depends on the ratio of gates per pulse (g/p) and $t_{Dp}$, and is equal to $(g/p) * t_{Dp}$; the oscillation frequency is its reciprocal. The ratio g/p also sets the region of operation on the delay-separation curve. By increasing the number of pulses *p* while keeping the g/p ratio constant, the rate of pulse arrival at each gate stays unchanged and the oscillator operates at the same frequency. However, the total power increases $\propto p$, and the effective frequency stability is enhanced. The duty cycle of the oscillator is determined by the ratio of the width of pulses (set by the delay of the reset loop, $t_{Dn}$) and the oscillator period.
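The relation between the gate-per-pulse ratio and the operating period can be illustrated with a short Python sketch. The function name `cpo_period` and the 100 ps gate delay are assumptions made only for this illustration, and $t_{Dp}$ is treated as a fixed number even though in a real CPO it depends on the operating point on the delay-separation curve.

```python
def cpo_period(g: int, p: int, t_dp: float) -> float:
    """Nominal period (g/p) * t_Dp of a g-gate CPO carrying p pulses."""
    if not 1 <= p <= g:
        raise ValueError("expected 1 <= p <= g")
    return (g / p) * t_dp


t_dp = 100e-12  # assumed gate delay of 100 ps, for illustration only

# Scaling g and p together keeps g/p, and therefore the period, unchanged.
for g, p in [(8, 2), (16, 4), (40, 10)]:
    period = cpo_period(g, p, t_dp)
    print(f"g={g:2d} p={p:2d} period={period * 1e12:.0f} ps "
          f"frequency={1.0 / period / 1e9:.2f} GHz")
```

Under these assumptions, the three configurations share one period, while the number of circulating pulses (and hence power) grows with *p*.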

Since it is possible for a g-gate CPO to run in multiple modes set by the number of pulses p, it is crucial to ensure a reliable start-up in the desired mode. The $V_{\text{fire}}$ input of the pulse gate is used to inject a start-up pulse into the CPO. Starting a mode with p pulses can be done in two ways: 1) p pulses can be injected sequentially into the $V_{\text{fire}}$ node of a single pulse gate with relative separations close to the expected period of the CPO and 2) the $V_{\text{fire}}$ nodes of p uniformly spaced gates can be injected with a pulse simultaneously. As long as the separation of start-up pulses is large enough to avoid pulse coalescence, the CPO will start up reliably. If the pulse injection period at start-up does not exactly match the period of a stably operating CPO or there is a skew between the fire inputs to the different gates, the repulsive dynamics in pulse gates will still lead to uniform distribution of pulses, and the CPO will rapidly settle to a stable state. Fig. 3 shows the start-up circuit and output waveforms for a g = 8, p = 2 CPO. The fire pulse is generated by converting an external edge to a pulse and is injected into gates 1 and 5 of the CPO. Fig. 4 shows the period of pulse arrival obtained from Hspice simulations for a g = 8, p = 2 CPO in GFUS8RF technology at different gates after start-up. Since, at start-up, the $V_{\text{crit}}$ nodes of the CPO do not store any residual charge, the gate delay and period of the CPO are different from the stable period. Once oscillations are sustained, the pulse arrival period settles to a stable value at all gates in a few cycles.

Fig. 3. Startup circuit and waveforms for g = 8, p = 2 CPO.

Fig. 4. Period of pulse arrival at different gates of g = 8, p = 2 CPO after startup settles to a constant value in less than ten cycles.

Widely used models for the design of oscillators like the linear time-variant (LTV) model using impulse sensitivity function (ISF) [20] and nonlinear analysis techniques of stable oscillators in the presence of perturbations [25] assume that phase errors due to noise persist indefinitely, with their magnitude equal to the initial injected error. They do not account for nonlinear phase error corrections with time, a property fundamental to the operation of CPOs, and hence cannot be directly applied to analyze CPOs. To overcome this challenge, this article formulates a new sequential time delay model that explains the noise properties of CPOs as a function of various oscillator parameters. This model is presented in Section III.

#### III. ANALYTIC MODEL OF PULSE ARRIVAL TIME UNDER IMPULSE NOISE PERTURBATION

This section presents time-domain analytic solutions for the pulse arrival time for CPOs perturbed by a noise impulse. Although a bilinear approximation for the delay–separation curve is more accurate [see Fig. 2(b)] for CPOs, the model assumes small perturbations for which the CPO operates in either one of the linear regions (Region 2 or Region 3). Analytic solutions are obtained by modeling the arrival time of pulses at each pulse gate via a set of linear difference equations. The goal is to derive exact solutions for a few low-order loops, which can be used to construct a general perturbation solution.

#### A. Gates(g) = 1, Pulses(p) = 1

Consider a single-gate CPO with the output of a pulse buffer looping back to its input (see Fig. 5) that has been started by an external event. Let x[k] be the arrival time of the *k*th pulse at the gate. Then x[k] is given by (1), where *b* and *m* are constants

$$x[k] = x[k-1] + b + m(x[k-1] - x[k-2]). \tag{1}$$

The arrival time of a pulse at the gate is the sum of the arrival time of the previous pulse and the delay of the gate. The delay here is represented as a linear function of the separation that is set by the arrival times of the previous two pulses. In the case of ideal (noise-less) operation, the period (x[k] - x[k-1] = x[k-1] - x[k-2]) will be a constant given by b/(1-m). If an impulse of timing error of magnitude  $\epsilon$  is introduced into the system at k = 1, the initial conditions are given by

$$x[0] = 0, \quad x[1] = \epsilon + \frac{b}{1-m}. \tag{2}$$

Solving this linear difference equation, the *k*th pulse arrival time is given by

$$x[k] = \frac{bk}{1-m} + \frac{\epsilon}{1-m} - \frac{\epsilon m^k}{1-m}. \tag{3}$$

Fig. 5. One-pulse traveling in a one-pulse-gate loop.

The solution consists of three distinct terms, which are as follows.

- 1) The first term represents the ideal pulse arrival time, if no noise were introduced into the system.
- 2) For m < 0 (the case for CPOs), the second term represents the residual phase error after a sufficiently long interval. It can be observed that an injected impulse error of $\epsilon$ is reduced in magnitude to a residual error of $\epsilon/(1-m)$.
- 3) For m < 0 and |m| < 1 (again the case for CPOs), the third term represents a rapidly diminishing transient that reduces the magnitude of the phase error to its final residual value. However, it can be seen that if m > 0 or if |m| > 1, the phase error can grow with time, making the oscillator unstable.
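The closed-form solution (3) can be checked numerically by iterating the difference equation (1) from the initial conditions (2). The Python sketch below does this; the function names and the parameter values (b = 1, m = -0.3, and an impulse of 0.1) are illustrative assumptions and are not taken from the article.

```python
def simulate_single_gate(b, m, eps, n):
    """Iterate (1) from the initial conditions (2); returns x[0..n]."""
    period = b / (1.0 - m)
    x = [0.0, eps + period]
    for k in range(2, n + 1):
        x.append(x[k - 1] + b + m * (x[k - 1] - x[k - 2]))
    return x


def closed_form(b, m, eps, k):
    """Closed-form arrival time (3): ideal term + residual - transient."""
    return (b * k + eps - eps * m**k) / (1.0 - m)


b, m, eps = 1.0, -0.3, 0.1  # assumed values, for illustration only
n = 30
x = simulate_single_gate(b, m, eps, n)

# Largest mismatch between the iterated recurrence and the closed form.
max_diff = max(abs(x[k] - closed_form(b, m, eps, k)) for k in range(n + 1))

# Phase error left after the transient, to compare against eps / (1 - m).
residual = x[n] - n * b / (1.0 - m)
print(max_diff, residual, eps / (1.0 - m))
```

With a negative slope, the printed residual should be smaller in magnitude than the injected impulse, which is the correction property described in item 2).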

Since m > 0 or |m| > 1 have been shown to make the oscillator unstable, the subsequent analysis excludes these cases and is strictly for CPOs with m < 0 and |m| < 1. Next, we will evaluate the case of a single pulse traveling in a loop comprising two pulse gates.

#### B. Gates(g) = 2, Pulses(p) = 1

Let x[k] and y[k] be the arrival times of the *k*th pulses at gates 1 and 2, respectively (see Fig. 6). Then x[k] and y[k] are given by (4) and (5), where *b* and *m* are constants:

$$y[k] = x[k] + b + m(x[k] - x[k-1]) \tag{4}$$

$$x[k] = y[k-1] + b + m(y[k-1] - y[k-2]). \tag{5}$$

In the case of ideal (noise-less) operation, the pulse arrival periods at the two gates (x[k] - x[k-1] = y[k] - y[k-1] = y[k-1] - y[k-2]) will be constants given by 2b/(1-2m), with each gate contributing to half of the delay. If an impulse of timing error of magnitude  $\epsilon$  is introduced when a pulse arrives at gate 2 at k = 1, the initial conditions are given by

$$\begin{aligned} x[0] &= 0, & y[0] &= \frac{b}{1 - 2m} \\ x[1] &= \frac{2b}{1 - 2m}, & y[1] &= \epsilon + \frac{3b}{1 - 2m}. \end{aligned} \tag{6}$$

Solving this system of linear difference equations, the kth pulse arrival times are given by

$$x[k] = \frac{2bk}{1-2m} + \frac{\epsilon}{1-2m} - \frac{\epsilon * 2^{-k}}{2(1-2m)} * o[k] \tag{7}$$

where o[k] is given by

$$o[k] = \frac{1}{\sqrt{m(m+4)}} \times [(m(m+2-\sqrt{m(m+4)}))^k + (m(m-2+\sqrt{m(m+4)}))^k + (m-2)((m(m+2-\sqrt{m(m+4)}))^k - (m(m+2+\sqrt{m(m+4)}))^k)]. \tag{8}$$



Fig. 6. One-pulse traveling in a two-pulse-gate loop.

For CPOs with m < 0 and |m| < 1, this loop operates at a period that is slightly smaller than twice the period of the previous case, where g = 1, p = 1. The magnitude of the residual phase error is further reduced compared to the g = 1, p = 1 case. The transient term, however, is a fairly complex function of time and the delay-separation slope *m*, as is expected of a third-order system. This transient term represents the settling trajectory (and time) of the loop to a stable state, in response to the impulse injection.
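The period and residual error quoted for this loop can be reproduced by iterating (4) and (5) directly from the initial conditions (6), without relying on the closed-form transient. The sketch below is a minimal check under assumed values (b = 1, m = -0.5, and an impulse of 0.1); the function name is hypothetical.

```python
def simulate_two_gate_one_pulse(b, m, eps, n):
    """Iterate (4) and (5) from the initial conditions (6); returns x[0..n], y[0..n]."""
    period = 2.0 * b / (1.0 - 2.0 * m)
    x = [0.0, period]
    y = [period / 2.0, eps + 1.5 * period]  # the impulse enters at y[1]
    for k in range(2, n + 1):
        x.append(y[k - 1] + b + m * (y[k - 1] - y[k - 2]))  # (5)
        y.append(x[k] + b + m * (x[k] - x[k - 1]))          # (4)
    return x, y


b, m, eps = 1.0, -0.5, 0.1  # assumed values, for illustration only
n = 60
x, y = simulate_two_gate_one_pulse(b, m, eps, n)

period = 2.0 * b / (1.0 - 2.0 * m)
residual_x = x[n] - n * period          # ideal arrival at gate 1 is k * period
residual_y = y[n] - (n + 0.5) * period  # ideal arrival at gate 2 is half a period later
print(residual_x, residual_y, eps / (1.0 - 2.0 * m))
```

Both gates should settle to the same residual, matching the second term of (7).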

#### C. Gates(g) = 2, Pulses(p) = 2

Let x[k] and y[k] be the arrival times of the *k*th pulses at gates 1 and 2, respectively (see Fig. 7). Then x[k] and y[k] are given by (9) and (10), where *b* and *m* are constants:

$$y[k] = x[k-1] + b + m(x[k] - x[k-1]) \tag{9}$$

$$x[k] = y[k-1] + b + m(y[k] - y[k-1]). \tag{10}$$

In the case of ideal (noise-less) operation, the pulse arrival periods at the two gates (x[k] - x[k - 1] = y[k] - y[k - 1]) will be constants given by b/(1 - m). If an impulse of timing error of magnitude  $\epsilon$  is introduced when a pulse arrives at gate 2 at k = 1, the initial conditions are given by

$$x[0]=0, \quad y[0]=0, \quad x[1]=\frac{b}{1-m}, \quad y[1]=\epsilon+\frac{b}{1-m}. \tag{11}$$

Solving this system of linear difference equations, the kth pulse arrival times are given by

$$x[k] = \frac{bk}{1-m} + \frac{\epsilon}{2(1-m)} - \frac{\epsilon * 2^{-k}m^k}{2(1-m)} * o_x[k] \tag{12}$$

$$y[k] = \frac{bk}{1-m} + \frac{\epsilon}{2(1-m)} - \frac{\epsilon * 2^{-k}(-m)^k}{2(1-m)} * o_y[k] \tag{13}$$

where  $o_x[k]$  and  $o_y[k]$  are given by

$$o_{x}[k] = \frac{1}{\sqrt{1 + m(m+6)}} \times ((-m-1-\sqrt{1 + m(m+6)})^{k} - (-m-1+\sqrt{1 + m(m+6)})^{k}) \tag{14}$$

$$o_{y}[k] = -o_{x}[k]. \tag{15}$$

The first term of the solution in this case exactly matches the first term of the case where g = 1, p = 1. This also matches the intuitive expectation, as fixing the g/p ratio and changing p should result in the same oscillation frequency. The magnitude of the residual phase error [the second term in (12) and (13)] is exactly half of that in the g = 1, p = 1 case. This illustrates that by doubling the effective mass of the ring (and its power consumption), while keeping its oscillation frequency constant, the magnitude of the residual phase error is halved. Once again, the transient terms for the two gates are fairly complex, but also show a symmetry. The two gates act in conjunction such that their respective phase errors symmetrically approach the final residual phase error in the loop.

Fig. 7. Two pulses traveling in a two-pulse-gate loop.

Based on the three exact solutions derived so far, the normalized phase error (obtained by subtracting the ideal noiseless arrival time from the derived solutions and normalizing it with respect to the magnitude of the injected impulse  $\epsilon$ ) is plotted in Fig. 8. The following observations can be made from these plots.

- 1) Fig. 8(a) compares the normalized phase error for oscillators exhibiting attractive (m > 0), constant (m = 0), and repulsive (m < 0) dynamics. It can be seen that constant dynamics result in a constant phase error of magnitude equal to the impulse noise injection, as is the case with conventional oscillators. To model a conventional ring oscillator, the p = 1 mode was chosen, as inverter-based ring oscillators only involve circulation of a single event around the ring. The phase error for attractive dynamics increases in magnitude, making the oscillator unstable. With repulsive dynamics, the oscillator settles to a phase error smaller in magnitude than the initial injected impulse.
- 2) Increasing the number of gates (g), number of pulses (p), or the magnitude of the (negative) delay-separation slope (m) reduces the magnitude of the residual phase error.
- 3) The settling time, that is, the time taken by the transient term to diminish in magnitude, is a complex function of the loop topology and operating slope. While it can be inferred that a multipulse ring tends to have a longer settling time, increasing the magnitude of the operating slope can increase or decrease the settling time, as can be seen in Fig. 8(b) and (c). For the g = 1, p = 1 and g = 2, p = 1 CPOs, changing *m* from -0.05 to -0.5 increases the settling time, whereas for the g = 2, p = 2 CPO, it causes the settling time to decrease.

#### D. Generalization: Gates(g), Pulses(p)

Fig. 8. Normalized phase error obtained from analytical solutions for (a) different delay-separation dynamics, (b) weakly repulsive dynamics, and (c) strongly repulsive dynamics.

Obtaining exact solutions for higher order loops (order > 3) is difficult. However, based on the three exact solutions derived, the fixed (noise-less) pulse arrival time and magnitude of the residual error can be inferred to have the forms shown in

$$x[k] = \frac{gbk}{p(1 - (g/p)m)} + \frac{\epsilon}{p(1 - (g/p)m)} + \epsilon * \operatorname{transient}_{x}[k, g, p, m] \tag{16}$$

or

$$x[k] = T_o k + \Phi[k]. \tag{17}$$

The general solution of (16) is rewritten in (17) as the sum of: 1) the nominal arrival time (with $T_o$ as the nominal period of the CPO), represented by the first term of (16) and 2) phase deviation $\Phi[k]$, represented by the sum of second and third terms of (16). The correctness of this general solution has been verified against both behavioral and Hspice simulations presented in Section V. The general analytic solution shows the following.

- 1) For a fixed number of gates g and (negative) m, CPOs operating in different modes, set by the number of pulses p, see a reduction in period that is less than proportional to p.
- 2) The magnitude of both the residual phase error and the transient term is directly proportional to the magnitude of the injected noise impulse. This makes the settling time independent of the magnitude of noise injection.
- 3) Increasing the number of gates g, the number of pulses p, or the magnitude of the (negative) delay–separation slope m reduces the magnitude of the residual phase error.
- 4) For a fixed oscillation frequency (obtained by having g/p and m constant), the residual phase error reduction for a noise impulse is proportional to the number of pulses p and hence the total power consumption. Qualitatively, this shows that for a loop, each gate's noise injection is scaled by the number of pulses which react as an ensemble to reduce the magnitude of the injected error.
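The first two terms of (16) are simple enough to evaluate directly, which makes the scaling in items 1) to 4) easy to tabulate. The following Python sketch does so; the function names and the parameter values are assumptions chosen only for illustration.

```python
def nominal_period(g, p, b, m):
    """Nominal period T_o, the coefficient of k in the first term of (16)."""
    return g * b / (p * (1.0 - (g / p) * m))


def residual_error(g, p, m, eps):
    """Residual phase error, the second term of (16)."""
    return eps / (p * (1.0 - (g / p) * m))


b, m, eps = 1.0, -0.2, 1.0  # assumed values; eps = 1 gives the normalized error

# Same g/p and m: the period is unchanged while the residual scales as 1/p.
for g, p in [(4, 1), (8, 2), (16, 4), (40, 10)]:
    print(g, p, nominal_period(g, p, b, m), residual_error(g, p, m, eps))

# Fixed g, different modes: the period shrinks more slowly than 1/p when m < 0.
for p in [1, 2, 4, 8]:
    print(8, p, nominal_period(8, p, b, m))
```

For g = 1, p = 1 these expressions reduce to the period b/(1-m) and residual $\epsilon/(1-m)$ of (3), and for g = 2, p = 2 to the residual $\epsilon/(2(1-m))$ of (12) and (13).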

These analytic solutions are based on impulse noise injection into a single CPO gate. To analyze the frequency stability with continuous power-law noise injection, Allan deviation, a time-domain stability measurement metric, is used.

#### IV. TIME DOMAIN MEASUREMENT OF STABILITY

We use Allan deviation as the analysis metric as it measures the stability of CPOs at different timescales. It characterizes the fractional frequency fluctuations (F[k]) in CPOs given by (18), where  $\Phi[k]$  is the phase deviation (in seconds) as a discrete function of time and  $\tau$  is the measurement interval

$$F[k] = \frac{\Phi[k+\tau] - \Phi[k]}{\tau}. \tag{18}$$

Noise in circuits generally exhibits a power law given by $S_F(f) \propto f^{\alpha}$, where $S_F(f)$ is the autospectral density of fractional frequency fluctuations F[k] and the exponent $\alpha$ typically ranges from -3 to +2. The well-analyzed noise sources in circuits, white FM and flicker FM, have $\alpha$ of 0 and -1, respectively. Allan deviation is the same as the ordinary standard deviation of fractional frequency fluctuation values for white FM noise, but has the advantage, for more divergent noise types such as flicker noise, of converging to a value that is independent of the number of samples [26]. Allan deviation is given by the square root of $\sigma^2(\tau)$ in (19), where $\Phi_i$ is the *i*th phase error value spaced by the measurement interval $\tau$ and N is the number of samples of phase error values averaged over $\tau$:

$$\sigma^{2}(\tau) = \frac{1}{2(N-2)\tau^{2}} \sum_{i=1}^{N-2} [\Phi_{i+2} - 2\Phi_{i+1} + \Phi_{i}]^{2}. \tag{19}$$

Overlapping Allan deviation is a variant of the original Allan deviation that provides better statistical confidence [26]. Modified Allan deviation given by the square root of (20) is another variant that can additionally distinguish between noise behaviors having  $\alpha \ge 1$  [27]. Here, the measurement interval  $\tau = a\tau_o$ , where *a* is the averaging factor and  $\tau_o$  is the basic measurement interval:

$$Mod \ \sigma^{2}(\tau) = \frac{1}{2a^{2}\tau^{2}(N-3a+1)} \times \sum_{j=1}^{N-3a+1} \left\{ \sum_{i=j}^{j+a-1} [\Phi_{i+2a} - 2\Phi_{i+a} + \Phi_{i}] \right\}^{2}. \tag{20}$$

Different frequency domain noise profiles can be identified by measuring the slope of modified Allan deviation ($Mod \ \sigma(\tau)$) on a log-log scale. White FM and flicker FM have a slope of -1/2 and 0, respectively. Sections V-VII of this article use modified Allan deviation to characterize the frequency stability of CPOs at different timescales. The dynamic phase error correction properties and the settling time associated with them are represented on modified Allan deviation plots by an initial high-slope (<-1/2) region. For comparing the jitter of CPOs operating at different frequencies, the Allan deviation values are scaled by the oscillation period ($T_o$) and plotted as a function of number of clock cycles. This metric is termed "jitter stability" in this article and the *k*-cycle jitter stability, J[k], is defined as follows:

$$\text{Jitter Stability, } J[k] = Mod \ \sigma(kT_o) * T_o. \tag{21}$$
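Expressions (19) to (21) translate directly into code operating on a list of phase-error samples. The sketch below is a plain-Python transcription, not the authors' tooling; the function names are hypothetical, the samples are assumed to be uniformly spaced, and `jitter_stability` additionally assumes one phase sample per nominal period $T_o$ so that the averaging factor equals the cycle count *k*.

```python
import math


def allan_variance(phi, tau):
    """Allan variance (19); phi holds phase errors (s) sampled every tau seconds."""
    n = len(phi)
    if n < 3:
        raise ValueError("need at least three phase samples")
    total = sum((phi[i + 2] - 2.0 * phi[i + 1] + phi[i]) ** 2 for i in range(n - 2))
    return total / (2.0 * (n - 2) * tau**2)


def modified_allan_variance(phi, tau0, a):
    """Modified Allan variance (20) at tau = a * tau0, with averaging factor a."""
    n = len(phi)
    terms = n - 3 * a + 1
    if a < 1 or terms < 1:
        raise ValueError("not enough samples for this averaging factor")
    tau = a * tau0
    total = 0.0
    for j in range(terms):
        # Inner sum of second differences over a consecutive starting points.
        inner = sum(phi[i + 2 * a] - 2.0 * phi[i + a] + phi[i] for i in range(j, j + a))
        total += inner**2
    return total / (2.0 * a**2 * tau**2 * terms)


def jitter_stability(phi, t_o, k):
    """k-cycle jitter stability (21), assuming one phase sample per period T_o."""
    return math.sqrt(modified_allan_variance(phi, t_o, k)) * t_o


# Synthetic check: a quadratic phase ramp (pure linear frequency drift).
phi = [float(i * i) for i in range(7)]
print(allan_variance(phi, 1.0), modified_allan_variance(phi, 1.0, 2))
```

For a = 1 the modified form reduces to (19), which gives a convenient sanity check on any implementation.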

#### V. BEHAVIORAL SYSTEM SIMULATOR

The aim of building the behavioral simulator was to: 1) verify the residual phase error expression in the general analytic solution of CPOs [see (16)] when injected by impulse noise; 2) analyze settling time behaviors of higher order loops; and 3) analyze the dependence of the frequency stability of CPOs on design parameters under continuous power-law noise injection. Fundamentally, the simulator is built upon the same abstract model as the analytic solutions. The details of the behavioral simulator are summarized as follows.

- 1) Pulse gates are abstracted as nodes that compute the arrival time of the output pulse based on the arrival time of the input pulse and the gate delay. The gate delay is modeled by linear and repulsive (m < 0) delay-separation dynamics and is given by

$$t_{Dp} = b + m\tau. \tag{22}$$

- 2) Pulses are assumed to be traversing a simple loop of pulse gates as shown in Fig. 1.
- 3) Pulse arrival times are the state variables initialized for the first two cycles and then updated sequentially.
- 4) Each gate can be injected by noise when a pulse arrives at its input. The noise injection is modeled by a perturbation of the pulse arrival time. This perturbation can either be an impulse, where only one gate is injected by noise at a given pulse arrival, or can be modeled as injection of white FM and/or flicker FM noise sources at each gate and at each pulse arrival.
- 5) For a g-gate CPO, the maximum number of pulses p equals the number of gates g.
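A minimal Python sketch of a simulator built on these five points is given below. It is not the authors' tool: the function name, the indexing convention (the n-th arrival at gate i), and the choice to initialize every gate in the noiseless steady state are assumptions made for this illustration, and only impulse injection is modeled. Each arrival is computed from the two most recent arrivals at the upstream gate using (22).

```python
def simulate_cpo(g, p, b, m, eps=0.0, kick_gate=0, kick_index=None, n_arrivals=400):
    """Return (t, err): t[i][n] is the n-th arrival time at gate i, err its phase error."""
    if not 1 <= p <= g:
        raise ValueError("expected 1 <= p <= g")
    period = (g / p) * b / (1.0 - (g / p) * m)  # nominal period T_o
    delay = b + m * period                      # steady-state gate delay from (22)
    start = p + 1                               # first index computed by the recurrence
    if kick_index is None:
        kick_index = start
    if not start <= kick_index < n_arrivals:
        raise ValueError("kick_index must lie in the simulated range")
    ideal = [[i * delay + n * period for n in range(n_arrivals)] for i in range(g)]
    t = [row[:] for row in ideal]               # rows start in the noiseless steady state
    for n in range(start, n_arrivals):
        for i in range(g):
            src = (i - 1) % g                   # upstream gate
            k = n if i > 0 else n - p           # closing the loop advances the index by p
            sep = t[src][k] - t[src][k - 1]     # separation seen by the upstream gate
            t[i][n] = t[src][k] + b + m * sep
            if i == kick_gate and n == kick_index:
                t[i][n] += eps                  # impulse perturbation of one arrival
    err = [[t[i][n] - ideal[i][n] for n in range(n_arrivals)] for i in range(g)]
    return t, err


eps = 0.1  # assumed impulse magnitude, for illustration only
for g, p, m in [(1, 1, -0.5), (2, 1, -0.5), (2, 2, -0.5), (8, 2, -0.2)]:
    _, err = simulate_cpo(g, p, 1.0, m, eps=eps)
    print(g, p, err[0][-1], eps / (p * (1.0 - (g / p) * m)))
```

Under these assumptions, the final error at every gate can be compared against the residual term of (16); continuous white FM or flicker FM injection would replace the single `eps` kick with a perturbation at every arrival.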

#### A. Impulse-Noise Analysis

CPOs with varying parameters (g, p, and m) were constructed and pulse arrival times under ideal (noiseless) operation were determined. Their behavior was then analyzed with impulse-noise injection. The ideal operating period  $(T_o)$  and residual phase error exactly match the first and second terms of the general analytic solution of (16). This comparison is shown in Fig. 9. Since the general analytic solution does not predict the transient term, the behavioral simulator is utilized to analyze the transient behavior in terms of the settling time. The settling time is obtained by computing the time taken for the phase error of all gates of the CPO to reach 1% of the final residual phase error after injection of a noise impulse.

Settling times as a function of |m| for 8-gate and 40-gate CPOs (m < 0) running in different modes are shown



Fig. 9. Comparison of (a)  $T_o$  and (b) residual phase error values obtained from the general analytic solution and behavioral simulator.



Fig. 10. Effect of *m* on settling time with impulse-noise injection obtained from behavioral simulator. (a) g = 8 CPO. (b) g = 40 CPO.



Fig. 11. Impulse noise simulation setup in Hspice.

in Fig. 10(a) and (b), respectively. These plots indicate that for p = 1 CPOs, larger values of |m| (for a constant g) lead to longer settling times. However, for multipulse CPOs (p > 1), the settling time is nonmonotonic with respect to |m|; increasing |m| leads to lower settling times until a minimum is reached, beyond which the settling time increases as |m| increases. It can also be observed that for a fixed m, increasing the density of a CPO ring (by increasing p for a fixed g) or increasing the ring diameter (or g) for a fixed ring density (or p/g) both result in longer settling times. Interestingly, the residual phase error diminishes in magnitude for larger ring densities and diameters. Hence, from a design perspective, there exists a tradeoff between the desired residual phase error and the amount of time taken to settle to the residual value.

#### B. Verification of Behavioral Simulations Using Hspice

The analysis so far on the behavior of CPOs has been based on an abstract model constructed using the linear and repulsive delay-separation dynamics of pulse gates. To verify that this model accurately captures salient properties of CPOs, impulse-noise results obtained from the behavioral simulator were verified against Hspice simulations. The simulation setup used in Hspice is shown in Fig. 11. g = 8 and g = 12 CPOs were tested, in the p = 2 and p = 3 modes, respectively.



Fig. 12. Comparison of normalized residual (impulse) phase error obtained from behavioral and Hspice simulations. (a) g = 8, p = 2. (b) g = 12, p = 3.

After starting the oscillators by firing two and three pulses, respectively, stable operation with a constant period was reached in about ten cycles. To model an impulse time perturbation, a small current impulse (1% of the peak output current) was injected into the input node of a single pulse gate. The pulse arrival times at each gate were taken at the rising-edge crossings of half the maximum pulse voltage ( $V_{dd}/2$ ). Phase errors were obtained by subtracting the pulse arrival times of an ideal noiseless simulation from those obtained after noise injection.

Phase errors obtained using Hspice and the behavioral simulator for two and three equidistant gates of the g = 8 and g = 12 CPOs, respectively, normalized with respect to the magnitude of the initial injected error, are shown in Fig. 12(a) and (b). The delay-separation slope m in the behavioral simulations was set to the same value as obtained from Hspice simulations. The two simulations match very closely in the trajectory of phase-error correction, the settling time, and the magnitude of the residual phase error. This validates that the abstract model and analytical solutions accurately capture the behavior of CPOs and can be used as a substitute for the relatively slow Hspice simulations.

#### C. Power-Law Noise Analysis

Discrete-time noise sequences generated by the algorithm presented in [28] were used for simulating power-law noise sources. For analyzing the resulting phase data using Allan deviation, the IEEE Standard Allan deviation tool Stable-32 [29] was used. Frequency stability over different timescales was analyzed for CPOs of various gate counts g, operating in various modes (set by p) and values of delay-separation slopes m. Figs. 13–15 show modified Allan deviation plots across the design space with both white FM and flicker FM noise injection. The magnitude of the flicker noise component was set to be 1/10th of the white noise component. The "conventional" oscillator in these plots corresponds to a single-event oscillator (p = 1) that does not exhibit any dynamics (m = 0). The error bars depict the 95% confidence limits.

Fig. 13 shows the modified Allan deviation of CPOs of different diameters (or g), but same pulse density (or p/g) and m. For larger values of p, the frequency stability substantially



Fig. 13. Improvement in frequency stability of CPOs as a function of p for fixed g/p and m obtained from behavioral simulations.



Fig. 14. Improvement in frequency stability of CPOs as a function of m for fixed g, p, and  $T_o$  obtained from behavioral simulations.

improves beyond  $\approx$ 3–4 ns, whereas the initial instability and higher slope region correspond to the larger settling time associated with multipulse CPOs. The single-pulse CPO does not exhibit any initial instability, due to its shorter settling time. The CPOs in this simulation naturally operate at identical frequencies (=5 GHz) and hence the improvement in frequency stability and jitter stability [see (21)] is the same. Comparing the jitter stability values obtained from this experiment, it was inferred that beyond the high-slope region, J[k] is  $\propto \sqrt{g}/p(1-(g/p)m)$ . The denominator is identical to the residual phase error derived in (16). The  $\sqrt{g}$  term in the numerator can be explained by the increase in the total (uncorrelated) noise power injection as the CPO diameter is increased. This indicates that at equal pulse density, equal frequency CPOs will exhibit an improvement in low-frequency P.N. that equals  $-10 \log p$ , where the oscillator power is  $\propto p$ . A unique feature of this class of CPOs is that any added power is distributed in space, and thus frequency stability is improved without increasing the power density.
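As a worked check of the last step, assume the inferred proportionality is grouped as $\sqrt{g}/[p(1-(g/p)m)]$ (this grouping of the denominator is our reading of the expression above). Holding the ratio $r = g/p$ and the slope $m$ fixed, so that $g = rp$, gives

$$J[k] \propto \frac{\sqrt{g}}{p\left(1 - \frac{g}{p}m\right)} = \frac{\sqrt{r\,p}}{p\,(1 - r m)} = \frac{\sqrt{r}}{1 - r m}\cdot\frac{1}{\sqrt{p}}, \qquad 20\log\frac{J[k]\big|_{p}}{J[k]\big|_{p=1}} = -10\log p.$$

Under these assumptions, doubling p (and hence g and the power) lowers the jitter stability by a factor of $\sqrt{2}$, or about 3 dB.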

Fig. 14 shows the modified Allan deviation for g = 40, p = 5 CPOs operating at different values of m. Compared to the conventional oscillator, the frequency stability of the m = -0.005 and m = -0.05 CPOs shows a slight deterioration for  $\approx 3$  ns. This is due to the relatively smaller corrections made by CPOs with smaller |m|, which increases the time for multipulse rings to reach an equilibrium, leading to a larger degradation in the frequency stability over short timescales.



Fig. 15. Improvement in (a) frequency stability and (b) jitter stability of CPOs as a function of p for fixed g obtained from behavioral simulations.

In contrast, for the m = -0.5 CPO, the disequilibrium between multiple pulses quickly falls to a small value, and as the errors get smaller, the corrections also get smaller, increasing the time to settle to the final (smaller) residual error. In this simulation, the CPOs were made to operate at the same frequency (3 GHz) by tuning their *b* values. Hence, the improvement in frequency stability and jitter stability of the compared CPOs is again the same. The J[k] values computed in this experiment match the relation between J[k] and CPO parameters inferred from the previous experiment.

Fig. 15 shows the frequency and jitter stability of g = 40 CPOs running in different modes and at different values of m. These values of m were chosen to simulate designs nearly identical to the fabricated 40-gate CPOs, the results of which are presented in Section VII. These CPOs all operate at different frequencies (550 MHz, 580 MHz, 2.8 GHz, and 4.6 GHz) and hence the jitter stability plots aid better in comparing their stability as a function of oscillation cycles. The larger initial instability and longer high-slope regions for CPOs running in a high-p mode correspond to their longer settling times. This simulation also validates the relation between J[k] and CPO parameters.

Results obtained from the behavioral simulations have helped us better understand the frequency stability of CPOs at different timescales and its dependence on design parameters. The main conclusions that can be drawn are that: 1) the phase error correction properties of CPOs improve their frequency stability across timescales beyond the settling time of the correction; 2) jitter stability of CPOs is  $\propto \sqrt{g}/p(1 - (g/p)m)$; and 3) CPOs identical in terms of frequency, pulse density (p/g), and m provide P.N. improvement of  $-10 \log p$, where the power consumption is  $\propto p$  and distributed in space. Although the CPOs in this section show better long-term stability than ring oscillators, they also consume more power due to the more complicated design of pulse gates. Analyzing this power-performance tradeoff is beyond the scope of the behavioral simulator and hence is done through Hspice simulations presented in Section VI.

#### VI. HSPICE-BASED COMPARISON OF CPOS AND RING OSCILLATORS

To compare the FoM of ring oscillators and CPOs based on frequency, power, and device noise, as well as their response to power-coupled and impulse noise, Hspice simulation results are presented. Comparisons are made between two 5-NAND-gate rings with different transistor strengths and g = 4 CPOs with different delay–separation slopes (m), as well as a g = 8, p = 2 CPO. The value of m for the CPOs was tuned by changing the relative delay of the feed-forward and feedback paths in the pulse gate, and the feedback path was built using four inverters for better controllability of m. NAND-gate rings were chosen for comparison, as they allow ring oscillators to be enabled/disabled easily. The ring was built using all NAND gates instead of one NAND gate and four inverters to ensure that the phases of the ring are uniformly spaced. This makes the comparison of ring oscillators to CPOs fair as CPOs also have additional inputs to start and enable/disable the ring. The schematics of the ring oscillator and CPOs used for comparison are shown in Fig. 16. The pull-down network of the pulse gates in this CPO contains additional inputs that allow the oscillator or the firing of a start-up pulse to be disabled by making the nodes enable\_ring and enable\_fire low, respectively.

The effect of device noise was evaluated by transient noise Hspice simulations conducted at a temperature of 45 °C and supply voltage of 1.5 V in GFUS8RF 130-nm technology. P.N. estimation with the Shooting Newton engine in Hspice or Spectre is based on the assumption that the oscillator follows an LTV model [31], and that injected phase errors persist indefinitely with a magnitude that equals the initial injected phase error. This assumption is not valid for CPOs as they exhibit temporal phase error correction, which in fact improves their frequency stability. This makes the simulated P.N. estimation for CPOs inaccurate. Fig. 17 shows the simulated P.N. for a g = 4, p = 1 CPO and a  $w_p = w_n = 8 \ \mu m$  5-NAND-gate ring oscillator. The rms phase jitter  $J_{RMS}$, obtained by integrating the P.N., is also shown on this plot. The relationship between P.N. and rms jitter is as follows [31]:

$$J_{\rm RMS} = \frac{1}{2\pi f_o} \sqrt{2 \int 10^{L(f)/10} df}. \tag{23}$$

Here  $f_o$  is the nominal oscillation frequency and L(f) is the single-sideband P.N. While P.N. simulation results predict





Fig. 16. Schematics of compared (a) 5-NAND-gate ring and (b) g = 4 CPO.

the CPO P.N. to be  $\approx 3$  dB higher than that of the 5-NAND-gate ring oscillator and consequently higher integrated rms phase jitter for the CPO, the jitter obtained from time-domain simulations (see Fig. 18) clearly shows the opposite trend. The *k*-cycle rms phase jitter was calculated from N = 100 runs of transient noise simulations based on the *k*-cycle phase error  $\Phi[k]$  as follows:

$$J_{\rm RMS}[k] = \sqrt{\frac{\sum_{i=1}^{N} \Phi_i[k]^2}{N}}. \tag{24}$$

This confirms the inadequacy of P.N. simulations for CPOs, and hence the comparisons in this section are based on time-domain simulations. The frequency, power, and 500-cycle rms phase jitter obtained from transient noise simulations for the oscillators are listed in Table I. Table I also includes the FoM improvement of CPOs with respect to that of ring oscillators. The improvement for the g = 4, p = 1 CPOs was calculated against the  $w_p = w_n = 8 \ \mu m$  NAND-gate ring, while that of the g = 8, p = 2 CPO was calculated against the  $w_p = w_n = 16 \ \mu m$  NAND-gate ring as follows:

$$\text{FoM improvement} = 20 \log \left(\frac{f_{CPO}}{f_{RO}}\right) - 10 \log \left(\frac{P_{CPO}}{P_{RO}}\right) - 20 \log \left(\frac{J_{RMS}[k = 500]_{CPO}}{J_{RMS}[k = 500]_{RO}}\right). \tag{25}$$

Here  $f_{CPO}$ ,  $P_{CPO}$ , and  $J_{RMS}[k = 500]_{CPO}$  are the frequency, power, and 500-cycle rms phase jitter of the CPO and  $f_{RO}$ ,  $P_{RO}$ ,  $J_{RMS}[k = 500]_{RO}$  are those of the ring oscillator against which the CPO is compared. While the CPOs operate at a higher frequency and power than the ring oscillators, the jitter

TABLE I Performance Comparison of CPOs and Ring Oscillators (Obtained From Hspice Simulations)

| Oscillator                              | Frequency<br>(GHz) | Power<br>(mW) | $J_{RMS}[k=500]$<br>(ps) | FoM<br>improvement<br>(dB) |
|-----------------------------------------|--------------------|---------------|--------------------------|----------------------------|
| 5-NAND Ring,<br>$w_p = w_n = 8\ \mu m$  | 5.91               | 3.62          | 0.91                     | -                          |
| g=4, p=1, m=-0.17                       | 6.75               | 4.53          | 1.43                     | -3.76                      |
| g=4, p=1, m=-0.22                       | 6.89               | 4.79          | 1.04                     | -1.06                      |
| g=4, p=1, m=-0.25                       | 6.99               | 5.04          | 0.94                     | -0.24                      |
| g=4, p=1, m=-0.3                        | 7.12               | 5.37          | 0.89                     | 0.06                       |
| g=4, p=1, m=-0.35                       | 7.22               | 5.69          | 0.82                     | 0.68                       |
| g=4, p=1, m=-0.43                       | 7.42               | 6.81          | 0.69                     | 1.6                        |
| 5-NAND Ring,<br>$w_p = w_n = 16\ \mu m$ | 5.92               | 7.26          | 0.58                     | -                          |
| g=8, p=2, m=-0.35                       | 7.22               | 11.38         | 0.51                     | 0.88                       |
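The FoM improvement of (25) is straightforward to recompute from the rows of Table I. The short sketch below does so; the function name and argument order are illustrative choices, and base-10 logarithms are assumed.

```python
import math


def fom_improvement_db(f_cpo, p_cpo, j_cpo, f_ro, p_ro, j_ro):
    """FoM improvement of a CPO over a reference ring oscillator, per (25).

    Frequencies, powers, and 500-cycle rms jitters only enter as ratios,
    so any consistent units (e.g. GHz, mW, ps) can be used.
    """
    return (20.0 * math.log10(f_cpo / f_ro)
            - 10.0 * math.log10(p_cpo / p_ro)
            - 20.0 * math.log10(j_cpo / j_ro))


# Reference rings from Table I: (frequency GHz, power mW, jitter ps)
ring_8um = (5.91, 3.62, 0.91)
ring_16um = (5.92, 7.26, 0.58)

# g = 4, p = 1, m = -0.17 CPO against the 8-um ring
weak_slope = fom_improvement_db(6.75, 4.53, 1.43, *ring_8um)
# g = 8, p = 2, m = -0.35 CPO against the 16-um ring
two_pulse = fom_improvement_db(7.22, 11.38, 0.51, *ring_16um)
```

The recomputed values agree with the tabulated FoM improvements to within the rounding of the listed frequency, power, and jitter entries.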



Fig. 17. Simulated P.N. and rms phase jitter obtained by integrating P.N.

in CPOs strongly depends on the value of m. It can be seen that larger magnitudes of m lead to lower jitter as expected and the FoMs show a trend similar to the trend in jitter. These results suggest that by operating CPOs at relatively high values of |m|, the phase correction properties lead to FoMs similar to those of ring oscillators. Jitter stability plots for the 5-NAND-gate rings and the g = 4, p = 1 and g = 8, p = 2 CPOs, also obtained from transient noise simulations, are shown in Fig. 19. The higher powered NAND-gate rings and CPOs show jitter stability values  $\approx \sqrt{2}$  smaller than their respective lower powered counterparts. For both CPOs, the jitter stability values are higher for approximately ten cycles, while the far-out values are better than those of the corresponding ring oscillators as expected.

Fig. 20 shows the magnitude of residual phase error in the CPOs and ring oscillators after the injection of an impulse of noise current, of magnitude 2  $\mu$ A. Although the ring oscillators see a smaller initial timing deviation for the same magnitude of injected noise, the CPOs show significantly smaller residual error values. The g = 4, p = 1 CPO has a residual error 38% better than the  $w_p = w_n = 8 \ \mu m$  NAND-gate ring, while the g = 8, p = 2 CPO has a residual error 35% better than the  $w_p = w_n = 16 \ \mu m$  NAND-gate ring. Fig. 21 compares the performance of CPOs and ring oscillators under the influence of sinusoidal power-coupled noise sources of amplitude 75 mV (5%  $V_{dd}$ ). The cycle–jitter and cycle–cycle jitter [32] for the g = 4, p = 1 CPO are 29 and 46% smaller



Fig. 18. RMS phase jitter obtained from transient noise simulations.



Fig. 19. Jitter stability obtained from transient noise simulations.



Fig. 20. Simulated normalized phase error of CPOs and ring oscillators with impulse noise injection.

than the  $w_p = w_n = 8 \ \mu m$  NAND-gate ring. These results indicate that CPOs designed to operate at high values of |m| offer significant improvements over ring oscillators in terms of both impulse noise and power-coupled noise rejection. The jitter values for the higher powered g = 8, p = 2 CPO in Fig. 21 are almost identical to those of the g = 4, p = 1 CPO, indicating that the lower jitter values with power noise are a result of the corrections due to a larger value of |m| and not multiple pulses. Finally, Fig. 22 shows the effect of supply voltage and temperature on the frequency of ring oscillators and CPOs, indicating that the tuning range of CPOs largely resembles that of ring oscillators.



Fig. 21. Comparison of CPOs and ring oscillators in terms of (a) cycle jitter and (b) cycle–cycle jitter with power-coupled noise simulation.



Fig. 22. Frequency as a function of (a) voltage and (b) temperature for 5-NAND-gate ring oscillator and g = 4, p = 1 CPO.

#### VII. MEASUREMENT RESULTS

CPOs with 8 and 40 gates fabricated in the GFUS8RF (130-nm) process have been tested. The chip micrograph is shown in Fig. 23(a) and CPO layouts are shown in Fig. 23(b) and (c). The g = 8 CPO can be run in either the p = 1 or p = 2 mode, and the g = 40 CPO can run in ten modes corresponding to p = 1-10. The pulse gate topology in the fabricated designs is similar to Fig. 16(b), with four inverters in the feedback loop and pull-down transistors to enable firing of a start-up pulse as well as enable the ring itself. To generate a start-up pulse, an external rising edge was driven into the chip and converted into a pulse. The pulse was used to drive the fire inputs of multiple gates of the two CPOs. For the g = 8 CPO, the fire input on gate-1 and gate-5 [Fig. 16(b), node  $b_1$ ] was driven by the fire pulse, with a separate enable\_fire signal (node  $b_2$ ) for gate-1 and gate-5. The CPO mode was set by

| Oscillator,<br>Mode | Frequency<br>$f_o$ (GHz),<br>$V_{dd} = 1.5$ V | Power<br>(mW),<br>$V_{dd} = 1.5$ V | Frequency range (GHz),<br>$V_{dd} = 0.6$–$1.6$ V | 1-MHz Phase<br>Noise<br>@ $f_o$ (dBc/Hz) | 10-MHz Phase<br>Noise<br>@ $f_o$ (dBc/Hz) | $\frac{J[k=100],\ g,\ p=1}{J[k=100],\ g,\ p}$ | # of<br>Phases | FoM<br>(dB) |
|---------------------|----------------------------------------------|-----------------------------------|----------------------------------------------|---------------------------------------|-----------------------------------------------------------------------------|-------------------------------------------|----------------|-------------|
| g=8, p=1            | 2.86                                         | 5                                 | 0.52-3.02                                    | -94.6                                 | -115.6                                                                      | 1.0                                       | 8              | 157.7       |
| g=8, p=2            | 4.87                                         | 8.5                               | 0.88-5.11                                    | -96.8                                 | -118.8                                                                      | 2.92                                      | 4              | 163.2       |
| g=40, p=1           | 0.58                                         | 5                                 | 0.09-0.62                                    | -109.4                                | -127.7                                                                      | 1.0                                       | 40             | 156.0       |
| g=40, p=2           | 1.15                                         | 9.8                               | 0.18-1.22                                    | -106.9                                | -126.5                                                                      | 1.98                                      | 20             | 157.8       |
| g=40, p=3           | 1.71                                         | 13.5                              | 0.28-1.81                                    | -104.8                                | -125.1                                                                      | 3.02                                      | 40             | 158.5       |
| g=40, p=4           | 2.26                                         | 18.5                              | 0.37-2.38                                    | -103.6                                | -124.2                                                                      | 4.06                                      | 10             | 158.6       |
| g=40, p=5           | 2.78                                         | 22.5                              | 0.45-2.93                                    | -102.2                                | -122.6                                                                      | 4.73                                      | 8              | 158.0       |
| g=40, p=6           | 3.29                                         | 27                                | 0.54-3.47                                    | -100.9                                | -122.7                                                                      | 5.97                                      | 20             | 158.7       |
| g=40, p=7           | 3.80                                         | 31.9                              | 0.62-4                                       | -99.6                                 | -120.3                                                                      | 6.42                                      | 40             | 156.8       |
| g=40, p=8           | 4.27                                         | 36                                | 0.71-4.49                                    | -100.9                                | -122.6                                                                      | 9.48                                      | 5              | 159.6       |
| g=40, p=9           | 4.51                                         | 38.2                              | 0.8-4.73                                     | -100.5                                | -122.7                                                                      | 10.08                                     | 40             | 160.0       |
| g=40, p=10          | 4.64                                         | 39.3                              | 0.86-4.85                                    | -100.5                                | -121.9                                                                      | 9.6                                       | 4              | 159.3       |

TABLE II GFUS8RF (130 nm) MEASURED RESULTS



Fig. 23. (a) Micrograph of fabricated chip in GFUS8RF (130 nm). Layouts of fabricated CPOs: (b) 8-gate CPO. (c) 40-gate CPO.

making one ($p = 1$ mode) or both ($p = 2$ mode) enable\_fire signals active by external control signals. The enable\_ring signal (node  $a_2$ ) was used to enable or disable oscillations in the ring. Similarly, for the g = 40 CPO, every fourth gate was driven by the fire pulse, with independent fire\_enable signals for each of the ten gates to run the CPO in different modes and a ring\_enable signal to enable the CPO.

The CPOs were characterized by both P.N. and frequency and jitter stability measurements. P.N. measurements were made using a Keysight PXA Signal Analyzer (N9030B). For obtaining the Allan deviation values, the CPO output waveforms captured on an Agilent Infiniium oscilloscope were sampled at 80 GSamples/s. The crossings of  $V_{dd}/2$  on the rising edges of the waveforms were computed, and using the resultant phase data, Allan deviation and jitter stability values were calculated in IEEE Stable-32 [29]. The error bars on these plots depict the 95% confidence limits. Power supply was regulated by an on-board Analog Devices LT3042 regulator. No on-chip regulator was used for these measurements. The measured frequency, power, tuning range, P.N. at 1- and 10-MHz offsets, number of phases, and FoM calculated at an offset frequency of 10 MHz are summarized in Table II. It also includes the normalized jitter stability at k = 100 cycles for different modes of the two CPOs. These values were computed by dividing the J[k = 100] for the p = 1 mode of each CPO by the J[k = 100] for the other modes of the same CPO. The FoM is calculated as follows:

$$FoM = 20\log\left(\frac{f_o}{\Delta f}\right) - P.N. - 10\log\left(\frac{P_o}{1\ \text{mW}}\right) \tag{26}$$

where  $f_o$  is the oscillation frequency,  $\Delta f$  is the frequency offset at which the P.N. is measured, and  $P_o$  is the oscillator power consumption in mW.
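As a quick numerical check of (26), the sketch below evaluates the FoM from a Table II row. The function name and the use of hertz for both frequencies are assumptions of this example; base-10 logarithms are assumed.

```python
import math


def fom_db(f_o_hz, offset_hz, pn_dbc_hz, power_mw):
    """Oscillator figure of merit per (26).

    f_o_hz    : oscillation frequency in Hz
    offset_hz : offset frequency at which the phase noise is read, in Hz
    pn_dbc_hz : phase noise at that offset in dBc/Hz (a negative number)
    power_mw  : power consumption in mW
    """
    return (20.0 * math.log10(f_o_hz / offset_hz)
            - pn_dbc_hz
            - 10.0 * math.log10(power_mw))


# g = 8 CPO in both modes, using the 10-MHz phase noise column of Table II
fom_p1 = fom_db(2.86e9, 10e6, -115.6, 5.0)
fom_p2 = fom_db(4.87e9, 10e6, -118.8, 8.5)
```

Both values reproduce the FoM column of Table II to within 0.1 dB, consistent with the FoM having been evaluated at the 10-MHz offset.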

For the g = 8 CPO, the p = 2 mode operates at a higher frequency and power than the p = 1 mode as expected. The modified Allan deviation and jitter stability for the two modes of the g = 8 CPO are plotted in Fig. 24(a) and (b). This CPO shows better jitter stability in the p = 2 mode than the p = 1 mode beyond  $\approx 2-3$  cycles. This is attributed both to the smaller residual phase error of multipulse CPOs and to the operation of the two-pulse and one-pulse CPOs in Region 2 and Region 3 of the delay-separation curve of Fig. 2(b), respectively, and hence at different magnitudes of m. An increase in the delay-separation slope (of relatively small magnitude) results in both smaller settling time (see Fig. 10) and smaller jitter stability, which improves both short- and long-term behavior as can be seen for the g = 8 oscillator. The value of J[k = 100] is 2.9 times smaller in the p = 2 mode compared to the p = 1 mode. If the two modes operated at identical (small) slopes, the improvement would have been  $\approx 2$  times instead. Consequently, the P.N. in the p = 2 mode is  $\approx 3$  dB better than in the p = 1 mode, resulting in an improvement in FoM of  $\approx$ 5 dB. These results show that multipulse CPOs can not only support higher operating frequencies, but also exhibit lower P.N., resulting in an improved FoM. Fig. 24(c) shows the measured P.N. for the g = 8, p = 2 CPO compared against the simulation values. As explained in Section VI, since phase error correction is not taken into account by P.N. simulators, the measured values at 1- and 10-MHz offsets are indeed lower than the simulated values. However, the simulated value




Fig. 24. Measurement results. (a)  $Mod \sigma(\tau)$  for g = 8 CPO. (b) J[k] for g = 8 CPO. (c) Comparison of simulated and measured P.N. for g = 8, p = 2 CPO. (d)  $Mod \sigma(\tau)$  for g = 40 CPO. (e) J[k] for g = 40 CPO. (f) Oscilloscope waveform for g = 40, p = 1 CPO.

TABLE III Performance Comparison of Fabricated g = 8, p = 2 CPO With Similar Previous Works

| Work         | Technology | Frequency<br>(GHz) | Power<br>(mW) | Phase Noise<br>(dBc/Hz),<br>offset<br>frequency<br>(MHz) | FoM<br>(dB) |
|--------------|------------|--------------------|---------------|----------------------------------------------------------|-------------|
| This<br>work | 130 nm     | 4.87               | 8.5           | -96.8, 1                                                 | 163.2       |
| [13]         | 90 nm      | 3.16               | 13            | -103.4, 1                                                | 162.25      |
| [24]         | 65 nm      | 5.0                | 1             | -101, 4                                                  | 162.94      |
| [33]         | 135 nm     | 4.7                | 14.8          | -97.5, 1, 4                                              | 159.24      |
| [34]         | 180 nm     | 0.9                | 65.5          | -106.1, 0.6                                              | 151.46      |

at 100-MHz offset is better than the measured value due to higher settling time associated with instabilities at large frequency offsets and high-frequency power-coupled noise during measurements.

For the g = 40 CPO operating in different modes, the frequency increases less than proportionally with the number of pulses p, which matches our analytic predictions. The modified Allan deviation and jitter stability for three modes (p = 1, p = 5, and p = 10) are plotted in Fig. 24(d) and (e), respectively. The p = 1 and p = 2 modes of the g = 8 CPO present nearly identical frequency (same g/p ratio) comparison points to the p = 5 and p = 10 modes of the g = 40 CPO. As expected, the g = 40 CPO outperforms the corresponding modes of the g = 8 CPO at longer timescales. At shorter timescales, however, the g = 8 CPO modes show better jitter stability, as they involve interactions between a smaller number of pulses and exhibit shorter settling times. As can be seen from Table II, the P.N. of the g = 40, p = 5 CPO is  $\approx 7.5$  dB better than that of the g = 8, p = 1 CPO. This confirms the inference made from behavioral simulations about the improvement in P.N. to be  $\propto \sqrt{p}$  for power  $\propto p$. Table II also shows the jitter stability for the g = 40 CPO in different modes at k = 100 cycles, compared against the p = 1 mode of this CPO. The improvement is close to the mode value p for modes p = 1-6, which is expected of CPOs operating at a small magnitude of *m*. The improvement in jitter stability dips slightly for the p = 7 mode, followed by a significant improvement for modes p = 8-10. One factor leading to this shift in the trend of jitter stability values is the switch in CPO operation from Region 3 to Region 2 of the delay-separation curve of Fig. 2(b). As the mode (p) is increased, the frequency and power consumption rise, the P.N. degrades slightly, and the FoM across all modes lies in a 3-4-dB range. Our model currently does not capture the effects of wire-length mismatches, substrate coupling, and power noise, which could all contribute to some nonmonotonicity in P.N. as can be seen for the p = 7-10 modes of this CPO.
This CPO can also provide up to 40 phases in certain modes with phase resolution as small as 5.56 ps ($p = 9$ mode) in 130-nm technology. Fig. 24(f) shows an oscilloscope waveform for the p = 1 mode of the g = 40 CPO. Table III shows the performance comparison of the g = 8, p = 2 fabricated CPO with similar previous works.
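The "# of Phases" column of Table II is consistent with a simple counting rule, g divided by the greatest common divisor of g and p. The sketch below applies that rule and derives the phase resolution from it. Both the rule and the helper names are inferences made for this example from the tabulated values, not expressions stated in the article.

```python
import math


def distinct_phases(g, p):
    """Number of distinct phase taps, assuming the rule g / gcd(g, p)."""
    return g // math.gcd(g, p)


def phase_resolution_s(f_o_hz, g, p):
    """Spacing between adjacent phases: one output period over the phase count."""
    return 1.0 / (f_o_hz * distinct_phases(g, p))


# g = 40, p = 9 mode at the measured 4.51 GHz from Table II
resolution_p9 = phase_resolution_s(4.51e9, 40, 9)
```

For the p = 9 mode this gives 40 phases and a resolution of roughly 5.5 ps, in line with the value quoted above.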

#### VIII. CONCLUSION

CPOs present a new design space for high-performance multiphase ring oscillators that can provide precise phase resolutions, not limited by the smallest achievable gate delay for a given technology. Pulse gates forming CPOs exhibit time-variant gate dynamics that cause pulses to distribute uniformly around the ring, enabling the existence of precise phases in even or odd numbers as well as mirror phase taps. Further, these dynamics result in temporal phase error corrections that can be utilized to improve the overall frequency stability of the oscillator beyond the first few cycles. This article presents detailed time-domain analysis of the behavior of CPOs, including closed-form analytic solutions that illustrate the effect of design parameters on noise properties. The analysis is strongly supported by behavioral and Hspice simulations, as well as measurements on fabricated designs.

The analysis and results presented in this article show that, to first order, the properties of CPOs are solely governed by the local gate delay-separation dynamics. For equal-frequency, equal-pulse-density CPOs, scaling power by a factor of p improves the frequency stability and P.N. by a factor of $1/\sqrt{p}$ or $-10 \log p$. A unique feature of such multipulse CPOs is that P.N. is improved by adding power that is distributed in space, and hence P.N. improvement is obtained without increasing the power density. CPOs achieve device noise-based FoMs similar to those of ring oscillators. However, CPOs are more resilient to noise that shows correlation among different gates, such as power noise. Finally, for systems dominated by noise profiles that are impulsive, CPOs present a potential to achieve frequency stability improvements $\propto 1/p$, significantly improving the power-versus-noise tradeoff.
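The difference between the $1/\sqrt{p}$ and $1/p$ scalings can be illustrated with a deliberately simple toy model. It is not the gate-level model used in this article: it assumes only that temporal phase error correction behaves like an average over the p pulses in the ring. Continuous device noise is represented as independent unit-variance Gaussian errors on each pulse, and an impulse as a single error shared evenly among all p pulses. The function names and parameter values are hypothetical.

```python
import random
import statistics

def averaged_jitter_std(p: int, trials: int = 20000, seed: int = 0) -> float:
    """Std. dev. of the mean of p independent unit-variance Gaussian timing errors."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = [
        sum(rng.gauss(0.0, 1.0) for _ in range(p)) / p
        for _ in range(trials)
    ]
    return statistics.pstdev(means)

def shared_impulse_error(p: int, impulse: float = 1.0) -> float:
    """Per-pulse error after one impulse is spread evenly over p pulses."""
    return impulse / p

for p in (1, 4, 16):
    continuous = averaged_jitter_std(p)
    impulsive = shared_impulse_error(p)
    print(f"p = {p:2d}: continuous ~ {continuous:.3f}, impulsive = {impulsive:.4f}")
```

In this toy picture, quadrupling p roughly halves the residual error from independent noise but quarters the residual error from a single impulse, which mirrors the two scalings stated above.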

#### ACKNOWLEDGMENT

The authors would like to thank MMI, Cadence, Synopsys, and Mentor Graphics for their software support, and Global Foundries and MOSIS for their student runs of GFUS8RF (130 nm) technology. They would also like to thank the anonymous reviewers for their constructive feedback.

#### REFERENCES

- [1] J. Yu, F. F. Dai, and R. C. Jaeger, "A 12-bit Vernier ring time-to-digital converter in 0.13 μm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 830–842, Apr. 2010.
- [2] A. Liscidini, L. Vercesi, and R. Castello, "Time to digital converter based on a 2-dimensions Vernier architecture," in *Proc. IEEE Custom Integr. Circuits Conf.*, vol. 9, Sep. 2009, pp. 45–48.
- [3] M. Park and M. H. Perrott, "A VCO-based analog-to-digital converter with second-order sigma-delta noise shaping," in *Proc. IEEE Int. Symp. Circuits Syst.*, May 2009, pp. 3130–3133.
- [4] W. Yu, J. Kim, K. Kim, and S. Cho, "A time-domain high-order MASH ΔΣ ADC using voltage-controlled gated-ring oscillator," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 60, no. 4, pp. 856–866, Apr. 2013.
- [5] S.-H. Lee *et al.*, "A 5-Gb/s 0.25-μm CMOS jitter-tolerant variable-interval oversampling clock/data recovery circuit," *IEEE J. Solid-State Circuits*, vol. 37, no. 12, pp. 1822–1830, Dec. 2002.
- [6] N. A. Kurd, J. S. Barkatullah, R. O. Dizon, T. D. Fletcher, and P. D. Madland, "A multigigahertz clocking scheme for the Pentium(R) 4 microprocessor," *IEEE J. Solid-State Circuits*, vol. 36, no. 11, pp. 1647–1653, Nov. 2001.
- [7] B. Razavi, "A study of phase noise in CMOS oscillators," *IEEE J. Solid-State Circuits*, vol. 31, no. 3, pp. 331–343, Mar. 1996.
- [8] A. A. Abidi, "Phase noise and jitter in CMOS ring oscillators," *IEEE J. Solid-State Circuits*, vol. 41, no. 8, pp. 1803–1816, Aug. 2006.
- [9] T. C. Weigandt, B. Kim, and P. R. Gray, "Analysis of timing jitter in CMOS ring oscillators," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, vol. 4, May 1994, pp. 27–30.
- [10] K. Takinami, R. Walsworth, S. Osman, and S. Beccue, "Phase-noise analysis in rotary traveling-wave oscillators using simple physical model," *IEEE Trans. Microw. Theory Techn.*, vol. 58, no. 6, pp. 1465–1474, Jun. 2010.
- [11] J. G. Maneatis and M. A. Horowitz, "Precise delay generation using coupled oscillators," *IEEE J. Solid-State Circuits*, vol. 28, no. 12, pp. 1273–1282, Dec. 1993.
- [12] H.-C. Chang, X. Cao, U. K. Mishra, and R. A. York, "Phase noise in coupled oscillators: Theory and experiment," *IEEE Trans. Microw. Theory Techn.*, vol. 45, no. 5, pp. 604–615, May 1997.
- [13] M. M. Abdul-Latif and E. Sánchez-Sinencio, "Low phase noise wide tuning range N-push cyclic-coupled ring oscillators," *IEEE J. Solid-State Circuits*, vol. 47, no. 6, pp. 1278–1294, Jun. 2012.
- [14] L. Hall, M. Clements, W. Liu, and G. Bilbro, "Clock distribution using cooperative ring oscillators," in *Proc. 17th Conf. Adv. Res. VLSI*, Sep. 1997, pp. 62–75.
- [15] Y.-W. Lin and S. S. H. Hsu, "A Sierpinski space-filling clock tree using multiply-by-3 fractal-coupled ring oscillators," *IEEE J. Solid-State Circuits*, vol. 52, no. 11, pp. 2947–2962, Nov. 2017.
- [16] G. Taylor and I. Galton, "A reconfigurable mostly-digital delta-sigma ADC with a worst-case FOM of 160 dB," *IEEE J. Solid-State Circuits*, vol. 48, no. 4, pp. 983–995, Apr. 2013.
- [17] S. D. Vamvakos *et al.*, "A 8.125–15.625 Gb/s SerDes using a subsampling ring-oscillator phase-locked loop," in *Proc. IEEE Custom Integr. Circuits Conf.*, Sep. 2014, pp. 1–4.
- [18] J. Carnes, I. Vytyaz, P. K. Hanumolu, K. Mayaram, and U.-K. Moon, "Design and analysis of noise tolerant ring oscillators using Maneatis delay cells," in *Proc. 14th IEEE Int. Conf. Electron., Circuits Syst.*, Dec. 2007, pp. 494–497.
- [19] M. Miller, "Realization and formal analysis of asynchronous pulse communication circuits," Ph.D. dissertation, Dept. Elect. Comput. Eng., Univ. California, Santa Barbara, Santa Barbara, CA, USA, 2015.
- [20] A. Hajimiri, S. Limotyrakis, and T. H. Lee, "Jitter and phase noise in ring oscillators," *IEEE J. Solid-State Circuits*, vol. 34, no. 6, pp. 790–804, Jun. 1999.
- [21] S. Fairbanks and S. Moore, "Analog micropipeline rings for high precision timing," in *Proc. 10th Int. Symp. Asynchronous Circuits Syst.*, Apr. 2004, pp. 41–50.
- [22] B. D. Winters and M. R. Greenstreet, "A negative-overhead, self-timed pipeline," in *Proc. 8th Int. Symp. Asynchronous Circuits Syst.*, Apr. 2002, pp. 37–46.
- [23] A. Dalakoti, M. Miller, and F. Brewer, "Pulse ring oscillator tuning via pulse dynamics," in *Proc. IEEE Int. Conf. Comput. Design (ICCD)*, Nov. 2017, pp. 469–472.
- [24] O. Elissati, A. Cherkaoui, A. El-Hadbi, S. Rieubon, and L. Fesquet, "Multi-phase low-noise digital ring oscillators with sub-gate-delay resolution," *AEU-Int. J. Electron. Commun.*, vol. 84, pp. 74–83, Feb. 2018.
- [25] A. Demir, A. Mehrotra, and J. Roychowdhury, "Phase noise in oscillators: A unifying theory and numerical methods for characterization," *IEEE Trans. Circuits Syst. I, Fundam. Theory Appl.*, vol. 47, no. 5, pp. 655–674, May 2000.
- [26] W. J. Riley, *Handbook of Frequency Stability Analysis*, vol. 1065. Gaithersburg, MD, USA: NIST Special Publication, 2008. [Online]. Available: https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=50505
- [27] D. W. Allan and J. A. Barnes, "A modified 'Allan variance' with increased oscillator characterization ability," in *Proc. 35th Annu. Freq. Control Symp.*, May 1981, pp. 470–475.
- [28] N. J. Kasdin and T. Walter, "Discrete simulation of power law noise (for oscillator stability evaluation)," in *Proc. IEEE Freq. Control Symp.*, vol. 1, May 1992, pp. 274–283.
- [29] W. J. Riley. *IEEE Stable 32 Software for Frequency Stability Analysis*. Accessed: Aug. 22, 2019. [Online]. Available: https://ieee-uffc.org/frequency-control/frequency-control-software/stable32/
- [30] *Virtuoso Spectre Circuit Simulator and Accelerated Parallel Simulator RF Analysis User Guide Version 12.1.1*, San Jose, CA, USA: Cadence Design Systems, May 2013.
- [31] C. Azeredo-Leme, "Clock jitter effects on sampling: A tutorial," *IEEE Circuits Syst. Mag.*, vol. 11, no. 3, pp. 26–37, 3rd Quart., 2011.
- [32] F. Herzel and B. Razavi, "A study of oscillator jitter due to supply and substrate noise," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 46, no. 1, pp. 56–62, Jan. 1999.
- [33] S. S. A. Saleh and N. Masoumi, "Wide-tuning-range, low-phase-noise quadrature ring oscillator exploiting a novel noise canceling technique," *AEU-Int. J. Electron. Commun.*, vol. 66, no. 5, pp. 372–379, 2012.
- [34] Z.-Q. Lu, J.-G. Ma, and F.-C. Lai, "A low-phase-noise 900-MHz CMOS ring oscillator with quadrature output," *Analog Integr. Circuits Signal Process.*, vol. 49, no. 1, pp. 27–30, Oct. 2006.