# A 7 bit 1 GS/s pipelined folding and interpolating ADC with coarse-stage-free joint encoding 

Mingshuo Wang$^{1\,\mathrm{a)}}$, Li Lin$^{2}$, Fan Ye$^{1}$, and Junyan Ren$^{1\,\mathrm{b)}}$<br>$^{1}$ State Key Laboratory of ASIC \& System, Shanghai 201203, China<br>$^{2}$ Analog Devices Inc, Shanghai 200021, China<br>a) 081021030@fudan.edu.cn<br>b) jyren@fudan.edu.cn


#### Abstract

This paper presents a single-channel 1.0-GS/s 7-bit pipelined folding and interpolating analog-to-digital converter (PL-FAI-ADC) for ultra-wideband (UWB) systems. An improved joint encoding method is proposed to eliminate the coarse sub-ADC and reduce the power consumption. A double-diode bootstrapped inter-stage switch is adopted to enable pipelined operation and improve the overall speed. The ADC, implemented in $0.13\text{-}\mu\mathrm{m}$ CMOS technology, achieves a signal-to-noise-and-distortion ratio (SNDR) of 37.89 dB and a spurious-free dynamic range (SFDR) of 45.89 dB for a 498 MHz input at $1.0\ \mathrm{GS/s}$. The power consumption is 98 mW at a sampling rate of $1.0\ \mathrm{GS/s}$ and supply voltages of 1.2 V / 2.5 V. The peak figure-of-merit (FoM) is 1.54 pJ/conversion-step.


Keywords: analog-to-digital converter, pipelined folding, interpolating, UWB
Classification: Integrated circuits

## References

[1] C. C. Hsu, C. C. Huang and Y. H. Lin: Symposium on VLSIC (2007). DOI:10.1109/VLSIC.2007.4342768
[2] Z. H. Cao, S. L. Yan and Y. C. Li: ISSCC Dig. Tech. Papers (2008). DOI:10.1109/ISSCC.2008.4523297
[3] E. Alpman, H. Lakdawala, L. R. Carley and K. Soumyanath: ISSCC Dig. Tech. Papers (2009). DOI:10.1109/ISSCC.2009.4977315
[4] Y. H. Chung and J. T. Wu: Symposium on VLSIC (2011).
[5] M. J. Choe, B. S. Song and K. Bacrania: Symposium on VLSIC (1999). DOI:10.1109/VLSIC.1999.797243
[6] Y. C. Li and E. S. Sinencio: IEEE J. Solid-State Circuits 38 (2003) 1405. DOI:10.1109/JSSC.2003.814429
[7] R. C. Taft, C. A. Menkus and M. R. Tursi: ISSCC Dig. Tech. Papers (2004). DOI:10.1109/ISSCC.2004.1332689
[8] X. C. Jiang and M. F. Chang: IEEE J. Solid-State Circuits 40 (2005) 532. DOI:10.1109/JSSC.2004.841033
[10] R. C. Taft, P. A. Francese and M. R. Tursi: ISSCC Dig. Tech. Papers (2009). DOI:10.1109/ISSCC.2009.4977316
[11] T. Yamase, H. Uchida and H. Noguchi: Symposium on VLSIC (2011).
[12] L. Lin, J. Y. Ren and F. Ye: Analog Integr. Circuits Signal Process. 58 (2009) 71. DOI:10.1007/s10470-008-9222-5
[13] F. Munoz, J. R. Angulo and A. L. Martin: Electron. Lett. 39 (2003) 701. DOI:10.1049/el:20030464
[14] L. Wang, W. J. Yin and J. Y. Ren: Electron. Lett. 42 (2006) 1275. DOI:10.1049/el:20062344
[15] B. Razavi: Principles of Data Conversion System Design (IEEE Press, New York, 1995).

## 1 Introduction

Low-power analog-to-digital converters (ADCs) with GHz sampling frequencies are key parts of broadband communication systems such as mm-wave receivers, UWB, OFDM-based 60 GHz receivers and future optical communication. These applications demand ultrahigh-speed ADCs with low power consumption and a large bandwidth, but only moderate resolution: generally, an effective number of bits (ENOB) of about six. Currently, such ADCs focus on three architectures. The first is the Time-Interleaved (TI) ADC with Successive-Approximation-Register (SAR) ADCs as sub-ADCs [1, 2, 3]. The second is the sub-ranging architecture [4]. The third is the architecture proposed in this paper.

In the first architecture, the SAR ADC has a simple structure and low power consumption, which makes it very suitable as a sub-ADC in the TI architecture. However, this kind of ADC has some flaws. Firstly, the fabrication process limits the minimum value of a unit capacitor, which prevents a single-channel SAR ADC from reaching a very high sampling rate. Consequently, a TI ADC needs more channels to reach a higher sampling rate, and mismatches among channels and the yield of channels become more and more critical [3]. Although a method has been proposed in [2] to improve the sampling rate, the complex digital control logic greatly degrades the performance of the ADC system. Meanwhile, the total capacitance not only defines the noise floor of SAR ADCs but also limits the setup time of the sampled signals. Because of this tradeoff, the bandwidth cannot be very large [1, 2, 3].

In the second architecture, the ADC can reach a GHz sampling rate in a single channel with low power consumption, which poses a big challenge for the design proposed in this work. However, its ENOB drops greatly at high input frequencies, so this kind of ADC is also not suitable for real-time communication systems.

In the third architecture, FAI is an efficient way to reach a gigahertz sampling rate. Folding and interpolating are both methods of reducing the power consumption of flash-type ADCs [5, 6, 7, 8, 9, 10, 11]. The folding architecture reduces the number of comparators and the interpolating architecture reduces the number of pre-amplifiers. This kind of ADC inherits the high sampling rate of flash-type ADCs while reducing the power consumption, and is therefore very suitable for the applications mentioned above. However, it also has some drawbacks. For example, it is sensitive to the offset voltage and to mismatches among folders and interpolators. The delay of the analog signal pre-processing paths determines the sampling rate, and the frequency doubling effect of the folders limits the input bandwidth. Some methods have been proposed in previous works to address these flaws. For example, a cascaded folding architecture is proposed to improve the bandwidth [12]. A power-efficient averaging method is proposed to reduce mismatches among folders or interpolators [8, 9]. A pipelined folding and interpolating architecture is proposed to reach a higher sampling rate [7, 10]. However, previous designs do not propose a method to reduce the power consumption. Although a one-bit folding stage has been adopted in place of the conventional pipelined stage to reduce the power consumption [11], it still needs a high-linearity pre-amplifier in the first stage because of the larger range between two adjacent reference voltages. In this paper, a new fine and coarse joint encoding is proposed to reduce the power consumption. Meanwhile, a new double-diode bootstrapped switch based on [13] is proposed to reduce the delay between adjacent stages. Moreover, the linearity of signals limits the supply voltage [7, 10], so a rail-to-rail preamplifier is proposed for low-supply-voltage applications.

## 2 Coarse-stage-free architecture

### 2.1 Pipelined folding and interpolating architecture

As shown in Fig. 1, the pipelined folding and interpolating architecture adopted in this paper consists of four stages along the signal path. The first stage contains a track-and-hold circuit (T/H), a voltage buffer and a reference voltage ladder. The single T/H block samples the continuous analog signal. It is followed by a source follower that acts as a voltage buffer to avoid kick-back noise and to drive the following preamplifier array. A fine preamplifier array, a coarse pre-amplifier, a first folding stage and the first inter-stage sampling switch array form the second stage. The 35 voltage levels generated by the reference voltage ladder divide the whole quantization range into 34 sections for the fine preamplifier array. The 12 folded signals with 35 zero-crossings generated by the first folding stage are tracked and held by the first inter-stage sampling switches in this stage. The coarse pre-amplifier is used to implement the joint encoding. A second folding stage, three interpolating stages and the second inter-stage sampling switches form the third stage. Through the second folding stage, the 12 folded signals become 4 folded signals with 35 zero-crossings. These 4 folded signals are then interpolated by a 3-stage interpolating block with an interpolating factor of $2\times$ in each stage. In the last stage, the comparator array and the digital encoding block perform the quantization and the encoding.
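The effect of the three $2\times$ interpolating stages on the number of fine-stage signals can be sketched numerically. The snippet below is a minimal illustration under idealized assumptions that are not taken from the paper: the 4 folded signals are treated as having equally spaced zero-crossings within one normalized fold period, and each $2\times$ interpolation is modeled as inserting one new zero-crossing midway between adjacent ones. The function name `interpolate_2x` and the normalization are hypothetical.

```python
def interpolate_2x(crossings, period=1.0):
    """Insert one zero-crossing midway between each adjacent pair (circular)."""
    n = len(crossings)
    out = []
    for i, x in enumerate(crossings):
        # The neighbour of the last crossing is the first one, one period later.
        nxt = crossings[(i + 1) % n] + (period if i == n - 1 else 0.0)
        out.append(x)
        out.append((x + nxt) / 2.0)
    return out

# Idealized assumption: 4 folded signals, equally spaced in one fold period.
crossings = [k / 4 for k in range(4)]
for _ in range(3):                      # three 2x interpolating stages
    crossings = interpolate_2x(crossings)

print(len(crossings))                   # number of signals after interpolation
```

Under these assumptions the 4 folded signals become $4 \times 2^{3} = 32 = 2^{5}$ signals, which is consistent with the 5-bit fine stage stated in Sect. 2.2.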

As shown in Fig. 1, the coarse stage is marked by the red dashed line and the fine stage is marked by the blue dashed line. The former differs from the conventional one because it uses the joint encoding, which is explained in detail in the next part. The latter is the same as the traditional one.


Fig. 1. Block diagram of the ADC

### 2.2 Coarse-stage-free joint encoding

An improved joint encoding method is proposed to remove the coarse preamplifiers. In this design, the fine stage has 5 bits and the coarse stage has 3 bits. A conventional coarse stage contains 9 preamplifiers and 9 comparators, according to the total folding factor $3 \times 3=9$, as shown by the black dashed line in Fig. 1 [6]. With the joint encoding method used in this design, the coarse stage contains just 1 preamplifier and 4 comparators, as shown by the red line and the blue line in Fig. 1. The coarse-stage-free joint encoding method includes two critical points:
(1) The first critical point is to use the folding technique in the coarse stage, which reduces the number of comparators used in the conventional coarse stage.
(2) The second critical point is that the folding zero-crossing points needed by the coarse stage are obtained from the fine stage, which reduces the number of preamplifiers used in the conventional coarse stage.
In this paper, the fine preamplifiers generate 36 initial zero-crossing points, which are coded from 1 to 36. The codes 4, 8, 12, 16, 20, 24, 28, 32 and 36 are needed by the coarse stage encoding, as marked by the black boxes in Fig. 2(a). If the folding technique is used in the coarse stage with a folding factor of 3, the zero-crossing points are those marked by the blue boxes. Because of the folding in the coarse stage, the repeated regions Reg1 and Reg2 need to be distinguished by an extra zero-crossing point, marked by the red 'x' in Fig. 2(a). Using the folding technique in the conventional coarse stage reduces the number of coarse comparators from 9 to 1 in this design. Moreover, the coarse zero-crossing points also exist on the first folding output signals (marked by blue codes) and the second folding output signals (marked by red codes) in the fine stage, as shown in Fig. 2(b). In Fig. 2(b), the 12 columns of part I stand for the outputs of the 12 folders of the first folding stage; the codes in each column are the initial zero-crossing points carried by that folder. The 4 rows of part II stand for the outputs of the 4 folders of the second folding stage; the codes in each row are the initial zero-crossing points carried by that folder. As is well known, a folding stage does not generate new zero-crossing points, so the coarse codes 4, 8, 12, 16, 20, 24, 28, 32 and 36 can be obtained from the first folding outputs or the second folding outputs. In this paper, a single folding step with a folding factor of 3 is used in the coarse stage, so the coarse codes are obtained from the first folding outputs, as shown by the blue lines in Fig. 1. This reduces the number of coarse preamplifiers from 9 to 1. In conclusion, the conventional coarse stage is nearly eliminated by the joint encoding method proposed in this paper, which helps to reduce the power consumption and the chip area of the ADC.

Moreover, in conventional encoding, mismatches between the coarse stage and the fine stage may cause errors between the zero-crossing points of the two stages. With the joint encoding method, the zero-crossing points are generated in the same signal path, which avoids this effect.

(a)

| Part I: All zero-crossing points of the first folding stage |  |  |  |  |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 |
| 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 |
| Part II: All zero-crossing points of the second folding stage |  |  |  |  |  |  |  |  |  |  |  |
| (1) | 1 | 5 | 9 | 13 | 17 | 21 | 25 | 29 | 33 |  |  |
| (2) | 2 | 6 | 10 | 14 | 18 | 22 | 26 | 30 | 34 |  |  |
| (3) | 3 | 7 | 11 | 15 | 19 | 23 | 27 | 31 | 35 |  |  |
| (4) | 4 | 8 | 12 | 16 | 20 | 24 | 28 | 32 | 36 |  |  |

(b)

Fig. 2. (a) Coarse-stage-free joint encoding (b) Zero-crossing points in folding stages
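The code assignment tabulated in Fig. 2(b) follows a simple modular pattern, which the following minimal sketch reproduces. It regenerates parts I and II of the table and checks which folder outputs carry the coarse codes 4, 8, ..., 36. The function and variable names are hypothetical and only illustrate the bookkeeping; they do not describe the circuit itself.

```python
N_CODES = 36        # initial zero-crossing codes, numbered 1..36
N_FOLDERS_1 = 12    # folder outputs of the first folding stage (part I)
N_FOLDERS_2 = 4     # folder outputs of the second folding stage (part II)

def folder_codes(n_folders):
    """Map each 1-based folder index to the codes it carries, as in Fig. 2(b)."""
    return {
        f: [c for c in range(1, N_CODES + 1) if (c - 1) % n_folders == f - 1]
        for f in range(1, n_folders + 1)
    }

part1 = folder_codes(N_FOLDERS_1)          # 12 folders, 3 codes each
part2 = folder_codes(N_FOLDERS_2)          # 4 folders, 9 codes each
coarse = list(range(4, N_CODES + 1, 4))    # 4, 8, ..., 36

# First-stage folders whose outputs contain at least one coarse code.
carriers = sorted(f for f, codes in part1.items() if set(codes) & set(coarse))

print(carriers)             # first-stage folders carrying coarse codes
print(part1[4], part1[8], part1[12])
print(part2[4] == coarse)   # does the fourth second-stage folder carry them all?
```

In this bookkeeping, the nine coarse codes are found on three of the first-stage folder outputs (three codes each) and on a single second-stage folder output, which is why no separate coarse preamplifier array is needed to generate them.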

## 3 Circuit implementation

### 3.1 Double-diode bootstrapped inter-stage switches with the reset MOS

In the pipelined structure, inter-stage sampling switches are inserted into the analog signal pre-processing paths and are among the most important modules in this design. Their on-resistance should be small enough to minimize the analog signal delay they introduce. Furthermore, the inter-stage sampling switches are used in an array in this design, so their circuits should be as simple as possible. In published works [5, 7, 10], CMOS switches are used as inter-stage sampling switches, but in low-voltage applications their on-resistance increases due to the lower overdrive voltage, which introduces a longer signal pre-processing delay. As an alternative, a real bootstrapped switch lowers the on-resistance and reduces this delay at a low supply voltage. However, complicated bootstrapped circuits [14] are not suitable for an array in terms of chip area and power efficiency. In this design, an improved double-diode bootstrapped switch based on [13] is proposed.

Two stages of inter-stage sampling switches are used in this paper. One is located between the two stages of folding amplifiers, and the other is located between the interpolating stage and the comparator array. The conversion time of the fine analog pre-processing is thereby extended from half a clock period to one full period.

A single-diode bootstrapped NMOSFET sampling switch is shown in Fig. 3(a) [13]. $\mathrm{M}_{1}$ is the NMOSFET switch and $\mathrm{C}_{\mathrm{Hn}}$ is the input capacitor of the next stage. $\mathrm{M}_{2}$ and C compose the bootstrapping circuit. The substrate and the drain of $\mathrm{M}_{2}$ are connected together (N-well technology is adopted) so that $\mathrm{M}_{2}$ works as a diode. The source of the diode-connected $\mathrm{M}_{2}$, as the positive terminal, is connected to $\mathrm{V}_{\mathrm{DD}}$, and the gate, drain and substrate are all connected to the gate of $\mathrm{M}_{1}$ as the negative terminal Vg. This prevents the PN junction formed between the N-well and the P+ source diffusion from becoming forward biased, and therefore allows positive voltage swings above $\mathrm{V}_{\mathrm{DD}}$ that are larger than the threshold voltage of a PN junction. This increases the overdrive voltage of $\mathrm{M}_{1}$ and reduces its on-resistance, so the analog signal delay is reduced.

During a positive edge of CLK with a step of $\Delta \mathrm{V}_{\mathrm{CLK}}$, Vg has a positive step of $\Delta \mathrm{Vg}$ and $\mathrm{M}_{1}$ turns on. Similarly, a negative edge of CLK causes a negative step of Vg and $\mathrm{M}_{1}$ turns off. $\Delta \mathrm{Vg}$ is given as

$$
\begin{equation*}
\Delta V_{g}=\frac{C}{C+C_{g}} \cdot \Delta V_{CLK} \tag{1}
\end{equation*}
$$

where Cg is the total parasitic capacitance at node Vg and C is the coupling capacitor. Generally, the W/L of $\mathrm{M}_{1}$ should be large to reduce the on-resistance, and C should be small to save layout area. Cg is mainly composed of two variable capacitances: the gate capacitance of $\mathrm{M}_{1}$ and the PN-junction capacitance. Both change with the bias voltage of node Vg. Therefore, Cg takes different values at the positive and negative edges of CLK, so $\Delta V_{g,\mathrm{rise}}$ and $\Delta V_{g,\mathrm{down}}$ are different. As a result, extra charge accumulates at node Vg, whose voltage becomes larger and larger, and $\mathrm{M}_{1}$ may break down. Fig. 4(a) shows simulation results of Vg for the circuit of Fig. 3(a) in a $0.13\text{-}\mu\mathrm{m}$ CMOS technology.
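The charge accumulation described above can be illustrated with a few lines of arithmetic based on Eq. (1). The following is a minimal sketch, not a circuit simulation: all capacitor values and the starting voltage are hypothetical numbers chosen only for illustration, and diode conduction and leakage are ignored.

```python
def gate_step(c, cg, dv_clk):
    """Eq. (1): step coupled onto Vg through C for a clock step dv_clk."""
    return c / (c + cg) * dv_clk

# Hypothetical values for illustration only (not from the paper).
C = 100e-15         # coupling capacitor (F)
CG_RISE = 40e-15    # parasitic Cg seen at the rising clock edge (F)
CG_FALL = 50e-15    # parasitic Cg seen at the falling clock edge (F)
DV_CLK = 1.2        # clock swing (V)

dv_rise = gate_step(C, CG_RISE, DV_CLK)
dv_fall = gate_step(C, CG_FALL, DV_CLK)
drift_per_cycle = dv_rise - dv_fall   # net change of Vg over one clock period

vg = 0.8            # arbitrary starting gate voltage (V)
for cycle in range(5):
    vg += drift_per_cycle
    print(f"cycle {cycle + 1}: Vg has drifted to {vg:.3f} V")
```

With these assumed numbers the rising step exceeds the falling step, so Vg creeps upward by a fixed amount every cycle; this is the mechanism that the second diode is meant to bound.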

In this design, an inter-stage double-diode bootstrapped NMOSFET sampling switch is proposed, as shown in Fig. 3(b). An extra diode-connected $\mathrm{M}_{3}{}^{\prime}$ is added. The substrate and the source of $\mathrm{M}_{3}{}^{\prime}$ are connected together to $\mathrm{V}_{\mathrm{DD}}$ as the negative terminal of a diode. The positive terminal of the diode-connected $\mathrm{M}_{3}{}^{\prime}$ is connected to node Vg. $\mathrm{M}_{3}{}^{\prime}$ guarantees that the voltage of node Vg does not exceed $\left(\mathrm{V}_{\mathrm{DD}}-\mathrm{V}_{\mathrm{thp}}\right)$, where $\mathrm{V}_{\mathrm{thp}}$ is the threshold voltage of $\mathrm{M}_{3}{}^{\prime}$.

However, the switch-off voltage is not zero, so the switch may not turn off completely. Therefore, in this paper, the reset MOS Mr is added to ensure that the switch turns off and to cancel the charge injection, as shown in Fig. 3(b).

Compared with CMOS switches, complicated bootstrapped switches or a single-diode bootstrapped switch, the double-diode bootstrapped switch features a lower and constant on-resistance, simplicity and stability.
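The role of the second diode can be pictured as a ceiling on the high-phase level of Vg. The sketch below is a purely behavioral illustration: `v_clamp` stands in for the limit $\left(\mathrm{V}_{\mathrm{DD}}-\mathrm{V}_{\mathrm{thp}}\right)$ stated above, and the starting level, drift per cycle and clamp value are hypothetical numbers rather than simulated or measured ones.

```python
def peak_gate_voltage(n_cycles, v_start, drift, v_clamp=None):
    """Track the high-phase level of Vg; limit it if a clamp level is given."""
    v = v_start
    history = []
    for _ in range(n_cycles):
        v += drift                   # net charge accumulated in this cycle
        if v_clamp is not None:
            v = min(v, v_clamp)      # second diode conducts above the clamp level
        history.append(v)
    return history

# Hypothetical numbers for illustration only.
V_START = 1.5    # initial high-phase gate voltage (V)
DRIFT = 0.05     # net upward drift per clock cycle (V)
V_CLAMP = 1.6    # stands in for (VDD - Vthp)

single = peak_gate_voltage(10, V_START, DRIFT)            # no clamp
double = peak_gate_voltage(10, V_START, DRIFT, V_CLAMP)   # with clamp

print(f"single-diode after 10 cycles: {single[-1]:.2f} V")
print(f"double-diode after 10 cycles: {double[-1]:.2f} V")
```

Without the clamp the modeled gate voltage keeps growing, while with it the level settles at the clamp value after a few cycles, which corresponds qualitatively to the contrast between Fig. 4(a) and Fig. 4(b).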


Fig. 3. (a) Single-diode bootstrapped switch (b) Double-diode bootstrapped switch


Fig. 4. (a) A curve of Vg for single-diode switch (b) A curve of Vg for double-diode bootstrapped switch without Mr

### 3.2 Rail-to-rail preamplifier

Traditional preamplifiers adopt a differential amplifier with resistive loads, as shown in Fig. 5. The common-mode input range of the preamplifiers determines the input range of the whole ADC system. As Fig. 5 shows, if all MOSFETs on the DC path work in the saturation region, the input range $V_{\text {input }}$ is given by

$$
\left\{\begin{array}{l}
V_{\text {input }} \geq V_{o d}+V_{T H}+V_{o d, I}  \tag{2}\\
V_{\text {input }} \leq \min \left(V_{D D}, V_{\text {out }, \text { com }}+V_{T H}\right)
\end{array}\right.
$$

where $V_{od}$ is the overdrive voltage of the input NMOSFET transistors, $V_{od,I}$ is the overdrive voltage of the tail-current NMOSFET transistor, $V_{TH}$ is the threshold voltage of all MOSFETs, and $V_{out,com}$ is the common-mode voltage of the preamplifiers' outputs. In $0.13\text{-}\mu\mathrm{m}$ CMOS technology, $V_{DD}$ is 1.2 V and the threshold voltage of the MOSFETs is about 0.4 V. The overdrive voltages of the input and tail-current transistors are required to be larger than 0.15 V. Assuming that $V_{out,com}$ is 0.6 V, $V_{input}$ then ranges from 0.7 V to 1.0 V, so the single-ended input range is 0.3 V. In this design, the resolution is 7 bit, so one least significant bit (LSB) is only slightly larger than 2 mV ($0.3\ \mathrm{V} / 2^{7} \approx 2.3\ \mathrm{mV}$), which is too small compared with the offset voltage of the input NMOSFETs. According to the datasheet offered by the technology foundry, the offset voltage is approximately larger than 3 mV when the W/L of the MOSFETs is larger than $4/0.5\ \mu\mathrm{m}$. So the value of one LSB does not meet the need of this design. Although enlarging the width and the length of the input NMOSFETs can reduce the intrinsic offset voltage, the parasitic load also increases. If the driving current of the preceding voltage buffer is kept constant, this increases the setup time between the output of the voltage buffer and the input of the preamplifier array, and the signal is likely to settle incompletely. If the driving current of the voltage buffer is increased in proportion, more power is consumed. Considering these trade-offs, a rail-to-rail preamplifier is implemented in this design to meet the higher linearity requirement. Its topology is shown in Fig. 6(a).

Fig. 5. Traditional preamplifier circuit
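The numbers in this paragraph follow directly from Eq. (2). The short sketch below evaluates both bounds with the values quoted in the text and then derives the LSB size. It assumes that one LSB equals the single-ended input range divided by $2^{7}$, which is an interpretation of the text rather than an explicit statement in it; the function and parameter names are hypothetical.

```python
def input_range(vdd, vth, vod, vod_tail, vout_cm):
    """Eq. (2): lower and upper bounds of the preamplifier input voltage."""
    v_min = vod + vth + vod_tail
    v_max = min(vdd, vout_cm + vth)
    return v_min, v_max

# Values quoted in the text for the 0.13-um process.
v_min, v_max = input_range(vdd=1.2, vth=0.4, vod=0.15, vod_tail=0.15, vout_cm=0.6)

span = v_max - v_min              # single-ended input range (V)
lsb_mv = span / 2**7 * 1e3        # assumed LSB definition: span / 2^7, in mV

print(f"input range: {v_min:.2f} V to {v_max:.2f} V (span {span:.2f} V)")
print(f"LSB at 7 bit: {lsb_mv:.2f} mV")
```

The resulting LSB of roughly 2.3 mV is smaller than the approximately 3 mV offset quoted from the foundry data, which is the motivation given for widening the input range with a rail-to-rail preamplifier.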

This preamplifier mainly includes two pairs of input MOSFETs, and two pairs of tail currents and resistive loads. The two input pairs consist of one pair of NMOSFETs to process the larger input voltage and one pair of PMOSFETs to handle the smaller input voltage. The total output voltage contains two parts. One is the folding voltage between $V_{in-}$ and $V_{ref-}$ generated by the PMOSFET inputs, and the other is the folding voltage between $V_{in+}$ and $V_{ref+}$ generated by the NMOSFET inputs. For example, when the common-mode voltage of $V_{in+}/V_{ref+}$ is between 0.6 V and 1.2 V, $V_{in+}/V_{ref+}$ can use the NMOSFETs as inputs. Meanwhile, the common-mode voltage of $V_{in-}/V_{ref-}$ is between 0 V and 0.6 V, so $V_{in-}/V_{ref-}$ can use the PMOSFETs as inputs. The opposite is also feasible. The curves of the differential inputs compared with the differential reference voltage are shown in Fig. 6(b). However, because of process mismatch, a complete match between the NMOSFET inputs and the PMOSFET inputs is very hard to achieve. Therefore, the transconductances of the PMOSFET ($g_{mp}$) and NMOSFET ($g_{mn}$) inputs are not equal, but this does not affect the outputs of the preamplifier: the zero crossing still occurs where $\left(V_{in+}-V_{in-}\right)$ equals $\left(V_{ref+}-V_{ref-}\right)$. Because the signals are differential, $\left(V_{in+}-V_{ref+}\right)$ equals $\left(V_{ref-}-V_{in-}\right)$, and this quantity is defined as $\Delta V$. The contributions of the NMOSFET and PMOSFET inputs to the output current are given in equation (3). From (3), the NMOSFET and PMOSFET inputs contribute to the preamplifier output in the same way.

$$
\begin{align*}
& I_{\text{out}}=g_{mn} \cdot\left(V_{\text{in}+}-V_{\text{ref}+}\right)+g_{mp} \cdot\left(V_{\text{ref}-}-V_{\text{in}-}\right) \\
& I_{\text{out}}=\left(g_{mn}+g_{mp}\right) \cdot \Delta V \tag{3}
\end{align*}
$$
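A quick numerical check of Eq. (3) shows why a mismatch between $g_{mn}$ and $g_{mp}$ changes the gain but not the position of the zero crossing. The sketch below is a small-signal illustration only; the transconductance values, the common-mode voltage and the helper names are hypothetical, and the input and reference are assumed to be ideal differential signals around the same common mode.

```python
def preamp_output_current(vin_p, vin_n, vref_p, vref_n, gmn, gmp):
    """First line of Eq. (3): NMOS pair on the + side, PMOS pair on the - side."""
    return gmn * (vin_p - vref_p) + gmp * (vref_n - vin_n)

def differential(v_cm, v_diff):
    """Split a differential voltage into its + and - single-ended components."""
    return v_cm + v_diff / 2, v_cm - v_diff / 2

V_CM = 0.6                     # assumed common-mode voltage (V)
GMN, GMP = 2.0e-3, 1.2e-3      # hypothetical, deliberately mismatched (S)
vref_p, vref_n = differential(V_CM, 0.4)

results = {}
for vin_diff in (0.3, 0.4, 0.5):
    vin_p, vin_n = differential(V_CM, vin_diff)
    i_out = preamp_output_current(vin_p, vin_n, vref_p, vref_n, GMN, GMP)
    results[vin_diff] = i_out
    print(f"Vin,diff = {vin_diff:.1f} V -> Iout = {i_out * 1e6:+.1f} uA")
```

The output current changes sign exactly where the differential input equals the differential reference, independent of the two transconductance values; away from that point it scales with $\left(g_{mn}+g_{mp}\right)$, as the second line of Eq. (3) states.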



Fig. 6. (a) Rail-to-rail preamplifier circuit (b) Waveforms of the preamplifiers' inputs

### 3.3 Cascade averaging network

One of the non-ideal factors of the ADC is the mismatch among folding signal paths, which can be reduced through structural methods. An averaging resistor network is one of the best choices. In the conventional averaging method, the averaging resistors lie at the preamplifiers' outputs [8]. If the averaging network is placed at the outputs of the preamplifiers, the preamplifiers are arranged in the order shown in Fig. 7. However, the inputs of the first folder correspond to the 1st, 13th and 25th preamplifier outputs, so the connections among them are very complex in layout. They bring large parasitic loads, which affect the gain and the bandwidth of the preamplifiers. Therefore, the averaging network is moved to the outputs of the 1st and 2nd folding stages. As shown in Fig. 7, the red averaging network is replaced by the two blue averaging networks. Simulation results show that the averaging effect is the same and that complex connections are avoided. The averaging resistance is chosen as large as possible for the same averaging effect, to reduce the influence on the gain and the bandwidth of the folders.

Fig. 7. Cascade averaging network

### 3.4 Comparator with self-cancelation of dynamic offset voltages



Fig. 8. Improved comparators circuits

The comparator always plays a critical role in the whole ADC system. Offset voltages at the inputs or outputs of the comparators are the main factors that affect their performance. In this design, a comparator array is used, so neither the static offset voltage nor the dynamic offset voltage can be ignored. The static offset voltage is mostly caused by process mismatch and can be reduced by enlarging the width and length of the MOS transistors. The dynamic offset voltage can be reduced by improving the comparator circuit. In this paper, an improved comparator, shown in Fig. 8, is proposed based on the traditional dynamic comparator circuit with a latch load shown in [15]. The conventional reset switch Mc generates a clock feed-through voltage on the loads of the comparator's outputs, which leads to the dynamic offset voltage. In the improved circuit, the single reset switch is divided into two switches $\mathrm{M}_{7}$ and $\mathrm{M}_{9}$, and dummy MOS transistors $\mathrm{M}_{8}$ and $\mathrm{M}_{10}$ are added to cancel the clock feed-through.

### 3.5 Cascade folder, active interpolator and source follower

In this paper, two-stage cascaded folding amplifiers are adopted to avoid a large folding factor in a single folding stage [12]. To obtain enough gain to suppress the offset voltage caused by the comparators, active interpolating amplifiers are adopted in this design [12]. They can cancel the dead zone of the differential pairs in the large-signal response and guarantee the linearity of the interpolated signals.

A source follower is used to isolate the input signal from the input sampling capacitor. Traditionally, connecting the substrate to the source in a PMOSFET design overcomes the body effect. However, it also introduces a voltage-dependent substrate capacitance that loads the output and degrades the linearity of the source follower in another respect. An improved source follower is therefore used, as illustrated in Fig. 9. A duplicated branch is also connected to the input, but the sizes of $\mathrm{M}_{1 \mathrm{~b}}$ and $\mathrm{M}_{2 \mathrm{~b}}$ in this path are chosen to be much smaller than the sizes of $\mathrm{M}_{1}$ and $\mathrm{M}_{2}$ in the main path. Connecting the substrate of $\mathrm{M}_{1}$ to that of $\mathrm{M}_{1 \mathrm{~b}}$ avoids the nonlinearity problem [8]. In this paper, the source follower uses the I/O voltage of 2.5 V as its supply voltage.


Fig. 9. Improved source follower

## 4 Measurement results

A single-channel 1.0-GS/s 7-bit PL-FAI-ADC is prototyped in $0.13\text{-}\mu\mathrm{m}$ CMOS, occupying a core area of $0.32\ \mathrm{mm}^{2}$. The die microphotograph is shown in Fig. 10(a). The output data is down-sampled by a factor of four to relax the data capture during testing. Fig. 10(b) shows the DNL and INL performance for a low-frequency 2.4 MHz input signal at a $1.0\ \mathrm{GS/s}$ sampling rate. The DNL and INL lie in the ranges of -0.47 LSB to 0.45 LSB and -0.62 LSB to 1.14 LSB, respectively. Fig. 11(a) shows the measured SNDR/SFDR versus input signal frequency at $1.0\ \mathrm{GS/s}$. The SNDR/SFDR reaches 41.66 dB / 50.4 dB at a 10.7 MHz input and 37.89 dB / 45.89 dB at a 498 MHz input. Fig. 11(b) shows the measured FFT spectrum at the Nyquist input frequency. The measured ENOB at the Nyquist input frequency at $1.0\ \mathrm{GS/s}$ is 6.0 bits, which fully meets the requirement of UWB systems. The total power consumption is 98 mW. In detail, the voltage buffer consumes about 50 mW from a 2.5 V supply, and the other blocks consume the remaining 48 mW from a 1.2 V supply. The performance summary of the measured results, compared with some published 7-bit gigahertz ADCs, is given in Table I (FoM $= P_{\mathrm{diss}} /\left(2^{\mathrm{ENOB}} \times 2 \times \mathrm{ERBW}\right)$, where the ENOB is the value measured at the Nyquist input frequency).
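The ENOB values quoted here and in Table I can be cross-checked against the measured SNDR values. The sketch below uses the standard sine-wave relation $\mathrm{SNDR} = 6.02 \cdot \mathrm{ENOB} + 1.76\ \mathrm{dB}$, which the paper does not state explicitly and is assumed here; the function name is hypothetical.

```python
def enob_from_sndr(sndr_db):
    """Standard sine-wave relation: SNDR = 6.02 * ENOB + 1.76 dB."""
    return (sndr_db - 1.76) / 6.02

# Measured SNDR values quoted in the text: input frequency (MHz) -> SNDR (dB).
measured = {10.7: 41.66, 498.0: 37.89}

enob = {f_mhz: enob_from_sndr(sndr) for f_mhz, sndr in measured.items()}
for f_mhz, bits in enob.items():
    print(f"{f_mhz:6.1f} MHz input: ENOB = {bits:.2f} bit")
```

Under this relation the two SNDR values round to 6.6 bit and 6.0 bit, in agreement with the ENOB entries listed for this work in Table I.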


Fig. 10. (a) The chip microphotograph (b) Measured DNL/ INL performance at 1.0 GS/s


Fig. 11. Measured (a) SNDR and SFDR vs input frequency (b) FFT spectrum at the Nyquist input frequency, at $1.0\ \mathrm{GS/s}$

Table I. Performance summary and comparison.

| Parameter | [11] | [1] | [6] | [9] | This <br> Work |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Technology | 45-nm | 90-nm | $0.35\text{-}\mu\mathrm{m}$ | 90-nm | $\mathbf{0.13\text{-}\mu m}$ |
| Sampling Rate | $1.3\ \mathrm{GS/s}$ | $1.1\ \mathrm{GS/s}$ | $300\ \mathrm{MS/s}$ | $800\ \mathrm{MS/s}$ | $\mathbf{1.0\ GS/s}$ |
| Resolution (bit) | 7 | 7 | 7 | 7 | $\mathbf{7}$ |
| ENOB <br> (bit@MHz) | 6.5/5.2 <br> @8/650 | 6.5/5.7 <br> @29/400 | 6.0/5.8 <br> @6/160 | 6.2/5.3 <br> @10/200 | $\mathbf{6.6/6.0}$ <br> $\mathbf{@10.7/498}$ |
| ERBW (MHz) | 650 | 300 | 60 | 200 | 490 |
| DNL/INL (LSB) | 1/1 | 0.36/0.46 | 0.6/1 | 0.8/1.3 | $\mathbf{0.47/1.14}$ |
| Supply (V) | 1.2 | 1.3 | 3.3 | 1.2/2.5 | $\mathbf{1.2/2.5}$ |
| Power (mW) | 22 | 92 | 200 | 120 | $\mathbf{98}$ |
| Calibration | YES | YES | NO | NO | NO |
| FoM <br> (pJ/Conv-step) | 0.46 | 2.21 | 11.2 | 7.6 | $\mathbf{1.54}$ |
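The FoM row of Table I can be recomputed from the power and ENOB rows. The sketch below applies $P_{\mathrm{diss}} /\left(2^{\mathrm{ENOB}} \times 2 \times f\right)$ to each column. As an assumption inferred from the numbers, and not stated in the paper, $f$ is taken to be the higher input frequency listed in the ENOB row (the frequency at which the second ENOB value was measured) rather than the ERBW row; with that reading the tabulated values are reproduced to within rounding. The function name and dictionary layout are hypothetical.

```python
def fom_pj_per_step(power_mw, enob, f_mhz):
    """FoM = P / (2^ENOB * 2 * f), returned in pJ per conversion step."""
    return power_mw * 1e-3 / (2**enob * 2 * f_mhz * 1e6) * 1e12

# column: (power in mW, ENOB at the high input frequency,
#          that input frequency in MHz, FoM reported in Table I)
table = {
    "[11]": (22, 5.2, 650, 0.46),
    "[1]": (92, 5.7, 400, 2.21),
    "[6]": (200, 5.8, 160, 11.2),
    "[9]": (120, 5.3, 200, 7.6),
    "This work": (98, 6.0, 498, 1.54),
}

computed = {}
for name, (p_mw, enob, f_mhz, reported) in table.items():
    computed[name] = fom_pj_per_step(p_mw, enob, f_mhz)
    print(f"{name:>9}: computed {computed[name]:.2f} pJ/step, reported {reported}")
```

For this work, the same expression evaluated with the tabulated ERBW of 490 MHz instead of 498 MHz gives a slightly larger value of about 1.56 pJ/conversion-step.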

## 5 Conclusion

A $1.0\ \mathrm{GS/s}$ 7-bit single-channel pipelined folding and interpolating ADC with coarse-stage-free joint encoding and double-diode bootstrapped NMOSFET inter-stage sampling switches with a reset MOS switch is implemented. These improvements are easy to implement and insensitive to the fabrication process. The ADC achieves an ENOB of at least 6.0 bits across the full Nyquist band at a sampling rate of $1.0\ \mathrm{GS/s}$ while consuming 98 mW.

## Acknowledgments

This work was sponsored by the Natural Science Foundation of China (grant no. 61006025), the Special Research Funds for Doctoral Program of Higher Education of China (grant no. 20100071110026) and National Science \& Technology Major Project of China (grant no. 2012ZX03001020-003).

