# A 200-Mbps~2-Gbps Continuous-Rate Clock-and-Data-Recovery Circuit

Rong-Jyi Yang, Student Member, IEEE, Kuan-Hua Chao, and Shen-Iuan Liu, Senior Member, IEEE

Abstract—A 200-Mbps~2-Gbps continuous-rate clock-and-data-recovery (CDR) circuit using half-rate clocking is presented. To detect data over a wide range of bit rates, a frequency tracing circuit (FTC) is used to aid the frequency acquisition. A wide-range, low-gain voltage-controlled oscillator (VCO) that uses both analog and digital control mechanisms is also presented. A two-level bang-bang phase detector is utilized to improve the jitter performance and speed up the locking process. This CDR circuit has been realized in a 2P4M 0.35-$\mu$m CMOS process. The experimental results show that this CDR circuit with the proposed FTC can receive a $2^{31} - 1$ pseudorandom bit stream at bit rates from 200 Mbps to 2 Gbps without the harmonic-locking issue. All measured bit error rates are below $10^{-12}$. The measured root-mean-square and peak-to-peak jitters are 5.86 ps and 41.8 ps, respectively, at 2 Gbps.

*Index Terms*—Clock-and-data-recovery (CDR), continuous rate, frequency detector, voltage-controlled oscillator (VCO).

## I. INTRODUCTION

With the aid of the wavelength-division multiplexing technique, several optical communication systems, such as SDH/SONET and 10-Gbase Ethernet, allow information to be exchanged at different bit rates at the same time in the optical domain. The retiming circuits have to detect the bit rate of the incoming data, and harmonic locking must be prevented. To receive data with different bit rates over a wide range without the harmonic-locking issue, two kinds of clock-and-data-recovery (CDR) circuits exist in the literature [1]–[5]. One is the multirate CDR circuit [1], [2] and the other is the continuous-rate CDR circuit [3]–[5]. For multirate CDR circuits, the frequency detection mechanism for multiple bit rates is based on either multiple reference clocks [1] or a single reference clock with a programmable divider [2]. Compared with multirate CDR circuits, continuous-rate CDR circuits can detect various bit rates over a wide range. However, a complex frequency detection circuit is required, such as a frequency synthesizer [4] or a time-to-digital converter [5].

To realize either continuous-rate CDR circuits or multirate ones, two important issues must be considered: one is to have a harmonic-free frequency detector and the other is to have a wide-range voltage-controlled oscillator (VCO) with a low conversion gain. A VCO with a high conversion gain will degrade the jitter performance of a CDR circuit. Conventional quadricorrelator frequency detectors (QFDs) are widely used in CDR circuits [3]. However, the locking range of a QFD is only $\pm 20\%$ for full-rate systems [6] and $\pm 15\%$ for half-rate systems [7]. For a QFD, harmonic locking will occur when the frequency range of the VCO exceeds twice the bit rate of the data. An additional auxiliary circuit is required to prevent this issue [3].

The authors are with the Graduate Institute of Electronics Engineering & Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan 10617, R.O.C. (e-mail: lsi@cc.ee.ntu.edu.tw).

Digital Object Identifier 10.1109/TCSI.2005.862071

Fig. 1. Proposed continuous-rate CDR circuit.

In this paper, several techniques are used to realize a continuous-rate CDR circuit. To extend the frequency acquisition range, a simple and harmonic-free frequency tracing circuit (FTC) is proposed. Its operating principle and limitations are discussed. A modified FTC is also presented to eliminate unexpected false operations. Meanwhile, a VCO with a wide tuning range and low conversion gain is presented, using analog and digital control mechanisms. A two-level half-rate bang-bang phase detector (BBPD) [8] is also adopted to improve the jitter performance and speed up the locking process. Moreover, an accessory digital loop [9], [10] that selects the desired frequency band is implemented off chip.

## II. CIRCUIT DESCRIPTION

The block diagram of the proposed half-rate CDR circuit is shown in Fig. 1. It is composed of the proposed FTC, a wide-range VCO, a half-rate two-level BBPD, and two charge-pump circuits. To enlarge the frequency range and reduce the conversion gain, the VCO is divided into three overlapped bands, which are controlled by a three-digit thermometer code. When the system is turned on, the control voltage Vc is discharged to ground so that the VCO starts from the lowest frequency. The FTC determines whether the VCO frequency is lower than half of the bit rate. If it is, the loop filter is charged. Once Vc exceeds the preset reference voltage (= 3 V), the off-chip digital control circuit selects a higher frequency band of the VCO and discharges the loop filter to ground. The process is repeated until the VCO's frequency is correct. After the frequency acquisition process is done, the two-level half-rate BBPD adjusts Vc according to the phase error between the incoming data and the VCO. Finally, this CDR circuit locks to generate the retimed data and the recovered clock. The details are explained as follows.

Manuscript received March 24, 2005; revised July 18, 2005. This paper was recommended by Associate Editor J. Silva-Martinez.

Fig. 2. Timing diagrams of (a) $f_{\rm vco} = 1/2$ bit rate and (b) $f_{\rm vco} < 1/2$ bit rate.

Fig. 3. (a) State diagram. (b) Truth table. (c) Implementation for simple FTC.
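The band-selection procedure described above can be sketched behaviorally in Python. This is only an illustration under stated assumptions: the linear VCO model, the per-band start frequencies and gains in `BANDS`, and the voltage step `DV` are hypothetical values chosen so that three overlapped bands span roughly 100 MHz to 1.1 GHz. Only the 3-V reference and the charge-then-switch-band sequence come from the text.

```python
# Hypothetical band model: (frequency at Vc = 0 in Hz, gain in Hz/V).
# These numbers are illustrative assumptions, not measured values.
BANDS = [(100e6, 130e6), (300e6, 170e6), (450e6, 230e6)]
V_REF = 3.0   # preset reference voltage from the text (V)
DV = 0.001    # hypothetical charging step per FTC "UP" action (V)


def vco_freq(band, vc):
    """Linear VCO model (an assumption): f = f0 + K * Vc."""
    f0, gain = BANDS[band]
    return f0 + gain * vc


def acquire(bit_rate):
    """Charge Vc while the VCO is slower than half the bit rate.

    When Vc exceeds V_REF, select the next band and discharge Vc to ground.
    Returns (band index, final Vc, final VCO frequency).
    """
    band, vc = 0, 0.0              # start in the lowest band with Vc at ground
    target = bit_rate / 2.0        # half-rate clocking
    while vco_freq(band, vc) < target:
        vc += DV                   # FTC charges the loop filter
        if vc > V_REF:
            if band == len(BANDS) - 1:
                raise ValueError("bit rate is above the modeled VCO range")
            band += 1              # off-chip logic selects a higher band
            vc = 0.0               # loop filter is discharged to ground
    return band, vc, vco_freq(band, vc)


if __name__ == "__main__":
    for rate in (200e6, 622.08e6, 2e9):
        band, vc, f = acquire(rate)
        print(f"{rate / 1e6:8.2f} Mbps -> band {band}, Vc = {vc:.3f} V, f = {f / 1e6:.1f} MHz")
```

In this sketch the search always proceeds upward from the lowest frequency, which mirrors the requirement that the control voltage be discharged to ground before acquisition starts.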

### A. Frequency Tracing Circuit

Assume that the CDR circuit is locked correctly to the incoming data, as shown in Fig. 2(a). Both rising and falling edges of the clock should align with the middle of the data bits, and at least one rising edge of the clock will appear between two consecutive rising edges of data [11]. When the clock is slower than half the bit rate, the timing relation is shown in Fig. 2(b). Note that, in this case, no rising edge of the clock appears between two consecutive rising edges of data. Based on this principle, a simple FTC can be realized via the state diagram shown in Fig. 3(a). The truth table and corresponding implementation for this simple FTC are shown in Fig. 3(b) and (c), respectively. Assume that the VCO starts from the lowest frequency. Initially, the FTC stays in State I. Assume the phase difference between the data and clock is large enough. When the data rises, the FTC moves to State II. In State II, two possible operations can occur. If the clock rises first, the FTC goes back to State I. If the data rises first, the FTC moves to State III and outputs an "UP" signal to speed up the VCO. A typical timing diagram is illustrated in Fig. 4(a). As long as the control voltage of the VCO is discharged to ground first, the FTC will adjust the VCO from the slowest frequency to the right one.

Fig. 4. Timing diagrams of FTC for (a) proper operation, (b) slightly lag operation, and (c) slightly lead operation of the rising edges of the data-to-clock signals.
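The state transitions of the simple FTC described above can be illustrated with a small event-driven model in Python. It is a sketch, not the circuit of Fig. 3: the function names `edge_events` and `ftc_up_count` are hypothetical, the data is assumed to be an alternating 1010 pattern (one rising edge every two bit periods), and the behavior in State III (return to State I on a clock rising edge, stay put on a further data rising edge) is an assumption, since the text only states how State III is entered.

```python
def edge_events(clock_ratio, n_data_edges=50, data_offset=0.3):
    """Merge rising edges of data ('D') and clock ('C') in time order.

    Time is measured in bit periods. The data is assumed to be an
    alternating 1010... pattern, so it rises every 2 bit periods.
    clock_ratio = f_clk / (bit_rate / 2); 1.0 is a correct half-rate clock.
    """
    t_clk = 2.0 / clock_ratio
    data = [(data_offset + 2.0 * k, "D") for k in range(n_data_edges)]
    t_end = data[-1][0]
    clock = []
    m = 0
    while m * t_clk <= t_end:
        clock.append((m * t_clk, "C"))
        m += 1
    return [name for _, name in sorted(data + clock)]


def ftc_up_count(events):
    """Count UP pulses produced by the three-state FTC for an edge sequence."""
    state = "I"
    ups = 0
    for ev in events:
        if state == "I":
            if ev == "D":          # data rises: arm the detector
                state = "II"
        elif state == "II":
            if ev == "C":          # clock rose first: clock is fast enough
                state = "I"
            else:                  # second data edge with no clock edge between
                state = "III"
                ups += 1           # UP speeds up the VCO
        else:                      # State III (exit condition is an assumption)
            if ev == "C":
                state = "I"
    return ups


if __name__ == "__main__":
    for ratio in (0.5, 0.8, 1.0, 1.2):
        print(ratio, ftc_up_count(edge_events(ratio)))
```

With a clock at or above half the bit rate, every pair of consecutive data rising edges encloses a clock rising edge, so the model never reaches State III. A slower clock produces UP pulses, which is the one-directional behavior the FTC relies on.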

Although this simple FTC stops once the clock frequency is equal to or higher than half the bit rate, some undesired false operations may still take place. Assume that the VCO's frequency is equal to half the bit rate and the phase difference between the data and the clock, $\Delta \Phi$, is small. When the clock rises while the FTC is in State II, the FTC must move to State I. If the clock rises slightly before the data, the FTC may go to State III first and then go back to State I, as shown in Fig. 4(b). This false operation of the FTC happens when the phase difference $\Delta \Phi$ satisfies

$$\Delta \Phi < T_{\rm RD} - T_{\rm ST} \tag{1}$$

where $T_{\rm RD}$ is the delay required to reset the finite-state machine (FSM) after the clock rises and $T_{\rm ST}$ is the delay required for a state transition. Assuming that all D-type flip-flops (DFFs) are matched and all logic delays are the same, the maximum value of $\Delta \Phi$ can be expressed as

$$\Delta \Phi_{\text{lead}} = T_{\text{RD}} - T_{\text{ST}} = (T_{D,C-Q} + T_{D,\text{reset}}) - T_{D,C-Q} = T_{D,\text{reset}} \tag{2}$$

where $T_{D,C-Q}$ is the delay from the clock to the output of a DFF and $T_{D,\text{reset}}$ is the delay required for the output of a DFF to become stable after the reset signal is asserted. A similar situation happens when the clock rises slightly after the data rises, as shown in Fig. 4(c). This false operation happens when $\Delta\Phi$ is smaller than the time $T_{\text{DD}}$ required for the logic output to be ready for the DFF $Q_C$. The maximum value of $\Delta\Phi$ can be expressed as

$$\Delta \Phi_{\text{lag}} = T_{\text{DD}} = T_{D,\text{setup}} + T_{D,C-Q} + T_{D,\text{logic}} \tag{3}$$

where $T_{D,\text{setup}}$ is the setup time of a DFF and $T_{D,\text{logic}}$ is the delay of an AND gate.

Fig. 5. (a) State diagram. (b) Truth table. (c) Implementation for a modified FTC.
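As a rough numerical illustration of (2) and (3), the simulated delays quoted later in this section for the 0.35-$\mu$m process can be substituted ($T_{D,C-Q} = 100$ ps and $T_{D,\text{setup}} = T_{D,\text{reset}} = T_{D,\text{logic}} = 30$ ps). Applying these values to the simple FTC is an assumption made here only to give a sense of scale:

$$\Delta \Phi_{\text{lead}} = T_{D,\text{reset}} = 30\ \text{ps}, \qquad \Delta \Phi_{\text{lag}} = 30\ \text{ps} + 100\ \text{ps} + 30\ \text{ps} = 160\ \text{ps}.$$

At 2 Gbps one bit period is 500 ps, so under these assumptions the two windows correspond to about 6% and 32% of a bit period, respectively.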

The false operations mentioned above make the VCO oscillate faster than half the data rate. The loop parameters should therefore be designed carefully, especially when the loop is underdamped. As long as the frequency deviation does not exceed the pull-in range, this frequency error can be compensated by the phase detector. However, it is better to eliminate the false operations. To eliminate the false operations caused by the finite delays of DFFs and logic circuits, an extra State IV is added to generate the UP signal instead of State III. The unexpected glitches mentioned before then do not affect the FTC, and the combinational logic is also simplified. The modified FTC without false operations is shown in Fig. 5, and the corresponding charge-pump circuit is illustrated in Fig. 6. The maximum operating speed is limited by the finite delays of the digital circuits. The minimum charging time, $T_{\rm charge}$, plus the total delay time for normal operation cannot exceed the minimum time between two consecutive rising edges of data. It can be expressed as

$$T_{\rm charge} + T_{\rm DD} + T_{\rm RD} < 2T_{\rm bit}. \tag{4}$$

Fig. 6. Charge-pump circuit for the modified FTC.

The maximum allowable bit rate of the FTC can be expressed as

$$\frac{1}{T_{\text{bit}}} < \frac{2}{2T_{D,C-Q} + T_{D,\text{setup}} + T_{D,\text{reset}} + T_{D,\text{logic}} + T_{\text{charge}}}. \tag{5}$$
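The step from (4) to (5) can be spelled out. Substituting $T_{\rm DD}$ from (3) and $T_{\rm RD} = T_{D,C-Q} + T_{D,\text{reset}}$ from (2) into (4) gives

$$T_{\text{charge}} + \left(T_{D,\text{setup}} + T_{D,C-Q} + T_{D,\text{logic}}\right) + \left(T_{D,C-Q} + T_{D,\text{reset}}\right) < 2T_{\text{bit}},$$

and, since all terms are positive, collecting the two $T_{D,C-Q}$ terms and taking reciprocals yields the bound on $1/T_{\text{bit}}$ in (5).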

Simulation results show that $T_{D,C-Q}$, $T_{D,\text{setup}}$, $T_{D,\text{reset}}$, and $T_{D,\text{logic}}$ are 100, 30, 30, and 30 ps, respectively, in a 0.35-$\mu$m 2P4M CMOS process. If $T_{\text{charge}}$ is 200 ps, the maximum allowable bit rate for the modified FTC is around 4.3 Gbps.
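The bound in (5) is easy to evaluate numerically. The short Python sketch below plugs in the delay values quoted above; the function name `max_bit_rate` is hypothetical. With exactly these rounded values the bound evaluates to roughly 4.1 Gbps, the same order as the approximately 4.3 Gbps quoted in the text; the small gap may come from rounding of the quoted delays.

```python
def max_bit_rate(t_cq, t_setup, t_reset, t_logic, t_charge):
    """Upper bound on the bit rate from (5); all delays in seconds."""
    total_delay = 2 * t_cq + t_setup + t_reset + t_logic + t_charge
    return 2.0 / total_delay


PS = 1e-12  # one picosecond in seconds

# Delay values quoted in the text for the 0.35-um process, with T_charge = 200 ps.
rate = max_bit_rate(t_cq=100 * PS, t_setup=30 * PS, t_reset=30 * PS,
                    t_logic=30 * PS, t_charge=200 * PS)

if __name__ == "__main__":
    print(f"maximum bit rate bound: {rate / 1e9:.2f} Gbps")
```

The factor of 2 in the numerator reflects that the tightest spacing between two consecutive data rising edges is two bit periods.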

### B. Wide-Range Voltage-Controlled Oscillator

To achieve a wide tuning range, ring oscillators using transmission gates have been published [12], [13]. In [12], the wide-tuning characteristic was achieved by adding transmission gates between the delay stages. However, adding transmission gates in the signal path makes the voltage swing of the clock nonconstant over the entire frequency range, especially at high frequency. Also, the single-ended structure is sensitive to power supply noise. Although the differential architecture in [13] could overcome power supply noise, the waveform may distort significantly at low frequency.

The proposed delay cell for a four-stage VCO is shown in Fig. 7. It can generate clocks with nearly 50% duty cycle and maintains a constant swing over a wide frequency range. Two transistors $M_{N1}$ and $M_{P1}$ ($M'_{N1}$ and $M'_{P1}$) form the fundamental inverter as the input devices. Two transistors $M_{C1}$ and $M_{C2}$ ($M'_{C1}$ and $M'_{C2}$) serve as a resistor controlled by the analog control voltage Vc. Another four nMOS transistors are connected to VDD to form a fixed path between $M_{P1}$ and $M_{N1}$. The cross-coupled pair $M_{P2}$ and $M'_{P2}$ speeds up the output transition. Moreover, the output swing is fixed and almost rail-to-rail over a wide frequency range. Six additional transistors, controlled by a three-digit thermometer code $D1 \sim D3$, divide the whole frequency range into three bands. In this way, not only is the tuning range of the VCO increased, but the conversion gain of the VCO is also reduced.

Fig. 7. Proposed VCO delay cell.

Fig. 8. Die photo.
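A rough comparison shows why splitting the range lowers the conversion gain. Assume, for illustration only, that a single band had to cover the full 100 MHz to 1.1 GHz operating range listed in Table I within a 0 to 3 V control range (the 3-V figure is taken from the preset reference voltage; treating it as the usable tuning range is an assumption, and the symbol $K_{\text{VCO,single}}$ is introduced here). The required gain would then be roughly

$$K_{\text{VCO,single}} \approx \frac{1.1\ \text{GHz} - 0.1\ \text{GHz}}{3\ \text{V}} \approx 333\ \text{MHz/V},$$

whereas the three measured band gains reported in Table I lie between 131.3 and 231.3 MHz/V.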

## III. EXPERIMENTAL RESULTS

The proposed CDR circuit has been fabricated in a 0.35-$\mu$m 2P4M CMOS process. Fig. 8 shows the die photo of the proposed CDR circuit. Its active area is 0.4 mm<sup>2</sup>, not including the loop filter. Two of the three passive components in the loop filter are integrated on the die; the exception is the largest one, the 2-nF capacitor C1. The values of R1 and C2 are 500 $\Omega$ and 24 pF, respectively, so that the phase margin is large enough over the entire frequency range. The power dissipation is 170 mW when receiving random data at 2 Gbps. About one third of the power is consumed by the wide-range VCO and another third by the VCO buffers. Both circuits could be further optimized to save more power. Fig. 9(a) shows the measured and simulated transfer curves of the proposed VCO, which cover 100 MHz to 1 GHz with three bands. The measured phase noise plot at 600 MHz is shown in Fig. 9(b). From Fig. 9(a), the measured result is similar to the simulated one in the slow-slow case because of the large parasitic effect contributed by the digital control mechanism. The parasitic effect also degrades the maximum oscillating frequency of the VCO; hence, the maximum bit rate cannot reach the theoretical value of 4.3 Gbps. The measured tuning range is slightly larger than in simulations because the parasitic effect was overestimated. Fig. 10 shows the frequency tracing process. The VCO is initialized at 782 MHz while the incoming data rate is 2 Gbps. The measurement result shows that the VCO can lock to the data at the right frequency of 1 GHz without a QFD. The curves A, B, and C represent the frequency acquisition processes when receiving nonreturn-to-zero (NRZ) data with pseudorandom bit streams (PRBS) of $2^7 - 1$, $2^{15} - 1$, and $2^{31} - 1$, respectively.

Fig. 9. (a) Measured and simulated VCO transfer curves for different digital control codes. (b) Measured phase noise at 600 MHz.

Fig. 10. Frequency acquisition process for different data patterns. A, B, and C represent the NRZ data with PRBS of $2^7 - 1$, $2^{15} - 1$, and $2^{31} - 1$, respectively.

Fig. 11. (a) Eye diagram and (b) clock jitter histogram @ $2^{31} - 1$ PRBS of 2 Gbps.
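To get a feel for the loop-filter values given above, the corner frequencies can be estimated with a few lines of Python. The text does not spell out the filter topology, so this sketch assumes the common second-order passive charge-pump filter (R1 in series with C1, with C2 in parallel across the pair); the function name `loop_filter_corners` is hypothetical.

```python
import math


def loop_filter_corners(r1, c1, c2):
    """Zero and pole frequencies (Hz) of an assumed R1-C1 || C2 loop filter."""
    f_zero = 1.0 / (2 * math.pi * r1 * c1)        # stabilizing zero from R1 and C1
    c_series = c1 * c2 / (c1 + c2)                # C1 and C2 in series
    f_pole = 1.0 / (2 * math.pi * r1 * c_series)  # ripple-suppression pole
    return f_zero, f_pole


# Component values from the text: R1 = 500 ohm, C1 = 2 nF, C2 = 24 pF.
f_z, f_p = loop_filter_corners(500.0, 2e-9, 24e-12)

if __name__ == "__main__":
    print(f"zero ~ {f_z / 1e3:.0f} kHz, pole ~ {f_p / 1e6:.1f} MHz")
```

Under this assumed topology the pole sits about 84 times above the zero (the ratio $(C1 + C2)/C2$), which is the kind of wide separation that leaves room for a comfortable phase margin.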





Fig. 12. Measured eye diagrams for  $2^{31} - 1$  PRBS of (a) 622.08 Mbps and (b) 200 Mbps.

TABLE I

PERFORMANCE SUMMARY

| Parameter                                   | Condition      | Value                          |
|---------------------------------------------|----------------|--------------------------------|
| Technology                                  |                | 0.35-µm 2P4M CMOS              |
| Power Supply                                |                | 3.3 V                          |
| Active Area                                 |                | 0.4 mm<sup>2</sup>             |
| Power Consumption                           | FTC            | 19 mW @ 2 Gbps                 |
|                                             | VCO            | 60 mW @ 2 Gbps                 |
|                                             | Buffers        | 65 mW @ 2 Gbps                 |
|                                             | BBPD           | 26 mW @ 2 Gbps                 |
|                                             | Total          | 170 mW @ 2 Gbps                |
| Data Rate                                   |                | 200 Mbps ~ 2 Gbps              |
| VCO Operating Frequency                     |                | 100 MHz ~ 1.1 GHz              |
| VCO Conversion Gain (codes 011, 001)        |                | 131.3 MHz/V                    |
|                                             |                | 172.9 MHz/V                    |
|                                             |                | 231.3 MHz/V                    |
| CP Current                                  | Two-Level BBPD | 50 µA / 100 µA                 |
|                                             | FTC            | 400 µA                         |
| Loop Filter                                 |                | R1: 500 Ω, C1: 2 nF, C2: 24 pF |
| Root-Mean-Square Jitter                     | @ 2 Gbps       | 5.86 ps                        |
|                                             | @ 1.25 Gbps    | 5.99 ps                        |
|                                             | @ 622 Mbps     | 10.89 ps                       |
|                                             | @ 200 Mbps     | 18.81 ps                       |
| Peak-to-Peak Jitter                         | @ 2 Gbps       | 41.8 ps                        |
|                                             | @ 1.25 Gbps    | 47.8 ps                        |
|                                             | @ 622 Mbps     | 84.4 ps                        |
|                                             | @ 200 Mbps     | 120 ps                         |

The measured eye diagram and the jitter histogram of the retimed clock are shown in Fig. 11(a) and (b), respectively, when the input data is a $2^{31} - 1$ PRBS at 2 Gbps. The measured root-mean-square and peak-to-peak jitters are 5.86 ps and 41.8 ps, respectively. Fig. 12(a) and (b) shows the measured eye diagrams for the $2^{31} - 1$ PRBS at 622.08 and 200 Mbps, respectively. The measured bit error rates from 200 Mbps to 2 Gbps are all below $10^{-12}$. The performance summary of this work is listed in Table I.
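The jitter figures are easier to compare across bit rates when expressed as a fraction of the bit period. The Python sketch below converts the values from Table I into unit intervals (UI). The UI normalization and the helper name `to_ui` are introduced here for illustration, and the 622-Mbps entry is taken as 622.08 Mbps, matching the eye-diagram measurement.

```python
# (bit rate in bit/s, rms jitter in ps, peak-to-peak jitter in ps), from Table I
MEASURED = [
    (2e9, 5.86, 41.8),
    (1.25e9, 5.99, 47.8),
    (622.08e6, 10.89, 84.4),
    (200e6, 18.81, 120.0),
]


def to_ui(jitter_ps, bit_rate):
    """Express a jitter value in picoseconds as a fraction of one bit period."""
    return jitter_ps * 1e-12 * bit_rate


if __name__ == "__main__":
    for rate, rms_ps, pp_ps in MEASURED:
        print(f"{rate / 1e6:8.2f} Mbps: rms = {to_ui(rms_ps, rate):.4f} UI, "
              f"p-p = {to_ui(pp_ps, rate):.4f} UI")
```

Seen this way, the absolute jitter grows at the lower bit rates while the jitter relative to the bit period shrinks, from about 0.084 UI peak-to-peak at 2 Gbps to 0.024 UI at 200 Mbps.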

## IV. CONCLUSION

The continuous-rate CDR circuit with the proposed FTC is realized in a 0.35-$\mu$m 2P4M CMOS process. The proposed FTC can aid the frequency acquisition for data over a wide bit-rate range without the harmonic-locking issue, as long as the initial frequency of the clock is lower than half the data rate. Together with the proposed wide-range VCO with reasonable conversion gain, the CDR circuit can receive NRZ data of $2^{31} - 1$ PRBS from 200 Mbps to 2 Gbps.

## ACKNOWLEDGMENT

The authors would like to thank the National Chip Implementation Center, Taiwan, for fabricating this chip.

## REFERENCES

- [1] J. Scheytt, G. Hanke, and U. Langmann, "A 0.155, 0.622 and 2.488 Gb/s automatic bit-rate selecting clock-and-data-recovery IC for bit-rate transparent SDH systems," *IEEE J. Solid-State Circuits*, vol. 34, no. 12, pp. 1935–1943, Dec. 1999.
- [2] D. Belot, L. Dugoujon, and S. Dedieu, "A 3.3 V power adaptive 1244/622/155 Mbit/s transceiver for ATM, SONET/SDH," *IEEE J. Solid-State Circuits*, vol. 33, no. 7, pp. 1047–1058, Jul. 1998.
- [3] D. Postson and A. Buchholz, "A 143-360 Mb/s auto-rate selecting data-retimer chip for serial-digital video signals," in *Dig. Tech. Papers IEEE Int. Solid-State Circuit Conf.*, Feb. 1996, pp. 196–197.
- [4] J. Frambach, R. Heijna, and R. Krosschell, "Single reference continuous rate clock-and-data-recovery from 30 MBit/s to 3.2 GBit/s," in *Proc. IEEE Custom Integ. Circuits Conf.*, May 2002, pp. 375–378.
- [5] J. Park and W. Kim, "An auto-ranging 50–210 Mb/s clock recovery circuit with a time-to-digital converter," in *Dig. Tech. Papers IEEE Int. Solid-State Circuit Conf.*, Feb. 1999, pp. 350–351.
- [6] B. Stilling, "Bit rate and protocol independent clock-and-data-recovery," *Electron. Lett.*, vol. 36, pp. 824–825, Apr. 2000.
- [7] R.-J. Yang, S.-P. Chen, and S.-I. Liu, "A 3.125 Gbps clock-and-data-recovery circuit for 10-Gbase-LX4 Ethernet," *IEEE J. Solid-State Circuits*, vol. 39, no. 8, pp. 1356–1360, Aug. 2004.
- [8] M. Ramezani and C. Salama, "A 10 Gb/s CDR with a half-rate bang-bang phase detector," in *Proc. IEEE Int. Sym. Circuits and Systems*, vol. II, May 2003, pp. 181–184.
- [9] W. B. Wilson *et al.*, "A CMOS self-calibrating frequency synthesizer," *IEEE J. Solid-State Circuits*, vol. 35, no. 10, pp. 1437–1444, Oct. 2000.
- [10] T.-H. Lin *et al.*, "A 900-MHz 2.5-mA CMOS frequency synthesizer with an automatic SC tuning loop," *IEEE J. Solid-State Circuits*, vol. 36, no. 3, pp. 424–431, Mar. 2001.
- [11] T. H. Toifl and P. Moreira, "Simple frequency detector circuit for biphase and NRZ clock recovery," *Electron. Lett.*, vol. 34, pp. 1922–1923, Oct. 1998.
- [12] N. Retdian, S. Takagi, and N. Fujii, "Voltage controlled ring oscillator with wide tuning range and fast voltage swing," in *Proc. IEEE Asia-Pacific Conf. Advanced Syst. Integr. Circuits*, Aug. 2002, pp. 201–204.
- [13] I.-C. Hwang and S.-M. Kang, "A self-regulating VCO with supply sensitivity of <0.15%-delay/1%-supply," in *Dig. Tech. Papers IEEE Int. Solid-State Circuit Conf.*, Feb. 2002, pp. 140–141.



**Rong-Jyi Yang** (S'03) was born in Taipei, Taiwan, R.O.C., in 1973. He received the B.S. degree in electrical engineering from National Central University, Jhongli, Taiwan, R.O.C., in 1998. He is currently working toward the Ph.D. degree in electronics engineering, National Taiwan University, Taipei, Taiwan, R.O.C.

His research interests include both analog and digital approaches of phase-locked loops, delay-locked loops, and high-speed CMOS data-communication circuits for multiple gigabit applications.



Kuan-Hua Chao was born in Taipei, Taiwan, R.O.C., on February 26, 1980. He received the B.S. degree in electronics engineering from the National Chiao Tung University, Hsin Chu, Taiwan, R.O.C., and the M.S. degree in electronics engineering from the National Taiwan University, Taipei, Taiwan, R.O.C., in 2002 and 2004, respectively.

His research interests include phase-locked loops, delay-locked loops (DLLs), and high-speed serial links. He is currently working at MediaTek Inc., Hsin Chu, Taiwan, R.O.C., where he focuses on analog and mixed-mode circuit design.



Shen-Iuan Liu (S'88-M'93-SM'03) was born in Keelung, Taiwan, R.O.C., in 1965. He received both the B.S. and Ph.D. degrees in electrical engineering from National Taiwan University (NTU), Taipei, Taiwan, R.O.C., in 1987 and 1991, respectively.

During 1991–1993, he served as a Second Lieutenant in the Chinese Air Force. During 1991–1994, he was an Associate Professor in the Department of Electronic Engineering, National Taiwan Institute of Technology, Taiwan, R.O.C. He joined the Department of Electrical Engineering, NTU, in 1994 and has been a Professor since 1998. His research interests are in analog and digital integrated circuits and systems.

Dr. Liu has served as a Chair on IEEE SSCS Taipei Chapter from 2004. He has served as a General Chair on the 15th VLSI Design/CAD Symposium, Taiwan, 2004 and as a Program Co-chair on the Fourth IEEE Asia-Pacific Conference on Advanced System Integrated Circuits, Japan, in 2004. He was the recipient of the Engineering Paper Award from the Chinese Institute of Engineers, in 2003, the Young Professor Teaching Award from MXIC Inc., the Research Achievement Award from NTU, and the Outstanding Research Award from National Science Council, in 2004. He served as a Technical Program Committee member for A-SSCC and ISSCC, in 2005. He currently serves as an Associate Editor of IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS. He is a Member of IEICE.