# A 9.2–12.7 GHz Wideband Fractional-N Subsampling PLL in 28 nm CMOS With 280 fs RMS Jitter

Kuba Raczkowski, Nereo Markulic, Student Member, IEEE, Benjamin Hershberg, Member, IEEE, and Jan Craninckx, Fellow, IEEE

Abstract—This paper describes a fractional-N subsampling PLL in 28 nm CMOS. Fractional phase lock is made possible with almost no penalty in phase noise performance thanks to the use of a 10 bit, 0.55 ps/LSB digital-to-time converter (DTC) circuit operating on the sampling clock. The performance limitations of a practical DTC implementation are considered, and techniques for minimizing these limitations are presented. For example, background calibration guarantees appropriate DTC gain, reducing spurs. Operating at 10 GHz the system achieves -38 dBc of integrated phase noise (280 fs RMS jitter) when a worst case fractional spur of -43 dBc is present. In-band phase noise is at the level of -104 dBc/Hz. The class-B VCO can be tuned from 9.2 GHz to 12.7 GHz (32%). The total power consumption of the synthesizer, including the VCO, is 13 mW from 0.9 V and 1.8 V supplies.

*Index Terms*—CMOS process, digital-controlled oscillators, digital-to-time converter, fractional-N, frequency synthesis, jitter, phase-locked loops, phase noise, radio transceivers, sampling, voltage-controlled oscillators.

# I. INTRODUCTION

RADIO-FREQUENCY synthesizers are ubiquitous building blocks of today's ever-growing networking solutions. Whether for high-throughput applications like LTE-Advanced or for sub-mW Internet-of-Things nodes, the phase noise of the RF synthesizer sets a limit to the achievable data rate or to the total radio power consumption, as one can often be traded for the other. The synthesizer has traditionally been a typical analog system. Analog performance, unfortunately, has not been a beneficiary of the last decade of CMOS scaling. On the contrary, the analog performance has worsened in the logic-centric technologies and this trend is expected to continue.

The answer from the PLL community has been clear. If synthesizer performance is to improve (and it must), the system has to become as digital in nature as possible. The intensive development of digital PLLs has led to solutions such as [1], which deliver excellent performance with a low power budget in attractive, self-calibrating systems. One can only expect this trend of migrating functionality into the digital domain to continue, even though the remaining analog blocks of a digital PLL (TDC, VCO) still dictate the fundamental limitation on phase noise.

K. Raczkowski, B. Hershberg, and J. Craninckx are with imec, 3001 Leuven, Belgium.

N. Markulic is with imec, 3001 Leuven, Belgium, and also with the Vrije Universiteit Brussel, Belgium.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSSC.2015.2403373

However, the analog subsampling PLL proposed in 2009 in [2] remains unbeaten in terms of integrated phase noise (or RMS jitter), as well as figure-of-merit (FoM). This competitive performance is achieved by removing the two classical contributors to phase noise: the frequency divider and the charge pump. Additionally, thanks to very high detection gain, any remaining noise sources in the loop are greatly suppressed. Finally, the advantage of a subsampling PLL is that it does not require high performance analog. There is no need for high accuracy matching (like in classical charge pumps) or for precise timing.

Unfortunately, the subsampling PLL has not enjoyed much popularity, probably because of the inherent integer-N operation of the synthesizer. Integer-N operation is unacceptable for any modern communication standard due to the very high congestion of the spectrum.

We propose a solution that enables fractional-N operation of a subsampling PLL [3]. Instead of measuring phase error with a TDC, we recognize that this error is known *a priori* and can be compensated for by using a digital-to-time converter (DTC). In this manner, we are able to achieve fractional-N lock, while retaining the key benefits of subsampling operation.

The subsampling PLL in its simplest form can be reduced to a system containing a VCO, a switch and a capacitor (Fig. 1). If we add a DTC to the system, which can be implemented as a few inverters and a capacitor bank, we arrive at a solution which can be technology independent thanks to its simplicity.

This paper is organized as follows. Section II explains the time domain operation of a subsampling PLL and introduces a method for enhancing it to achieve a fractional-N lock. Section III examines the relevant system-level challenges that arise when using a practical, performance-limited DTC. Section IV describes the circuit implementation of the fractional-N subsampling PLL and Section V presents the performance of the fabricated test chip. Finally, conclusions are given in Section VI.

# II. FRACTIONAL-N OPERATION OF A SUBSAMPLING PLL

# A. Time Domain Analysis of a Subsampling PLL

Our starting point for this analysis is the basic subsampling PLL consisting of a VCO, a sampler operated by a reference clock, a transconductor $(G_M)$ and a low-pass filter (LPF) (Fig. 1). Compared to the classical mixed-signal PLL, there is no frequency divider and the operation of the PFD and the charge pump is replaced by the sampler and the $G_M$. Phase detection happens by direct sampling of the VCO waveform at a rate dictated by the reference clock. The sampled voltage is converted into an error current, which is fed to the LPF. By definition, when the PLL is in the locked state, the phase error is zero, and hence the sampled voltage is zero and no current is fed to the LPF. No correction is necessary. If, however, a phase error is detected, the error current is a function of the phase error. The relation between the phase error $\Delta \phi_{VCO}$ and the output current $i_{G_M}$ is sinusoidal [2], though for small phase deviations it can be described simply by the detection gain $\beta_{SS} = \Delta i_{G_M} / \Delta \phi_{VCO} = A_{VCO} \cdot g_m$, where $A_{VCO}$ is the amplitude of the VCO and $g_m$ is the transconductance of the $G_M$. Contrary to a classical PLL, the phase detection circuits do not need to have high analog performance. Sampler nonlinearity or clipping can be tolerated, since the sampling point is always close to the zero crossing of the sampled voltage. Furthermore, the output charge is produced by the $G_M$ and the error information lies in the magnitude and sign of the resulting current and not in a variable current pulse duration (like in a conventional PLL). Finally, leakage of the sampler will be corrected by the loop if the opening of the $G_M$ output always happens with the same delay with respect to the sampling event.

0018-9200 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Manuscript received August 31, 2014; revised December 05, 2014; accepted January 30, 2015. Date of publication March 04, 2015; date of current version April 30, 2015. This paper was approved by Guest Editor Ranjit Gharpurey.

It is evident that the subsampling loop can only synthesize integer-N multiplications of the reference frequency. There is no phase modulation in the loop at all, and the only stable point is when the zero crossings of the VCO waveform match the timings of the edges of the reference. There is no divider in this loop, which means that the traditional method of applying  $\Delta\Sigma$ modulation to the divider [4] is out of reach.
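The detector characteristic of this subsection can be illustrated numerically. The following is a minimal sketch, assuming an idealized sinusoidal characteristic $i_{G_M} = g_m A_{VCO} \sin(\Delta\phi_{VCO})$; the amplitude and transconductance values are hypothetical and are not taken from the test chip.

```python
import math

def gm_output_current(phase_error_rad, vco_amplitude_v, gm_siemens):
    """Error current of an idealized subsampling detector (sinusoidal characteristic)."""
    return gm_siemens * vco_amplitude_v * math.sin(phase_error_rad)

def small_signal_gain(vco_amplitude_v, gm_siemens):
    """beta_SS = A_VCO * g_m: slope of the characteristic at zero phase error."""
    return vco_amplitude_v * gm_siemens

A_VCO = 0.4   # V, hypothetical sampled VCO amplitude
G_M = 1e-3    # S, hypothetical transconductance

beta_ss = small_signal_gain(A_VCO, G_M)
for dphi in (0.01, 0.1, 0.5):
    exact = gm_output_current(dphi, A_VCO, G_M)
    linear = beta_ss * dphi
    print(f"dphi={dphi:4.2f} rad  exact={exact:.3e} A  linear={linear:.3e} A")
```

The linear approximation holds closely near the zero crossing and degrades for larger phase errors, which is why the locked loop, sampling close to the zero crossing, tolerates sampler nonlinearity.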

# B. Enhancement of a Subsampling PLL to Enable Fractional-N Mode Operation

The basic subsampling PLL cannot synthesize fractional-N frequencies, because it lacks any phase modulation mechanism in the loop. There are in principle four nodes we can consider in Fig. 1 to introduce a phase modulating element. We certainly cannot add a frequency divider, as it would eliminate the detection gain advantage. We should not apply any costly operations at RF, as it will increase power consumption and phase noise. We could try to use a residue DAC as in [5] to correct the phase error, but it would require a DAC matched to the loop gain and in general the solution would be cumbersome. Finally, we can modulate the phase of the reference clock, which in fact is equivalent to modulating the frequency divider in a classical PLL. Instead of adapting the phase of the divided fractional-N signal to match phase of the reference, we adapt the phase of the reference to match the phase of the fractional-N frequency of the VCO.

Let us assume as an example that the VCO works at a target fractional-N frequency that is different from integer-N by 0.25 (e.g. in Fig. 2, N = 1.75). In the first cycle we will sample at the same time as in the integer-N mode. Then, in the second cycle we recognize a timing error of  $0.25 * T_{vco}$ . In order to sample a zero crossing of the VCO waveform and keep the loop in lock, we need to delay the sampling event by the same  $0.25 * T_{vco}$ .



Fig. 1. General system of a subsampling PLL with example timing.



Fig. 2. Implementation of fractional-N subsampling operation by delaying the sampling reference. The last sampling event aligns exactly with the beginning of the next cycle and therefore has no extra delay.

In the third cycle the timing error increases by the same amount to  $0.5 * T_{vco}$  and we need to delay sampling by that amount. In the fourth cycle the delay needs to be  $0.75 * T_{vco}$ . Finally, in the fifth cycle, we should sample  $1 * T_{vco}$  after the reference edge. However, we recognize that simply skipping a VCO cycle will yield the same effect. In other words, in the fifth cycle we sample aligned with the integer-N time again  $(0 * T_{vco})$ .<sup>1</sup> Critically, since we know the desired fractional-N frequency of the PLL as well as the reference frequency, we can calculate the position of any following zero crossings with absolute precision. This means that if we could implement an ideal delay generator, the PLL would be completely spur-less, unlike the traditional analog  $\Delta\Sigma$  PLL. Additionally, the tuning range of the delay generator only needs to cover one VCO period, since the calculations "wrap around" as in the aforementioned example.

 $^{1}$ Note that we can do this skipping operation thanks to the sinusoidal detection gain of the subsampling PLL, which repeats every  $T_{VCO}$ .
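The N = 1.75 example can be reproduced with a few lines of exact arithmetic. This is a minimal sketch; the function name `sampling_delays` is hypothetical, and the delay is expressed in VCO periods relative to each reference edge.

```python
from fractions import Fraction
import math

def sampling_delays(n, cycles):
    """Delay (in VCO periods) added to each reference edge so that the sampling
    instant lands on the next VCO zero crossing. A delay of one full period
    wraps around to zero, i.e. one VCO cycle is skipped instead."""
    return [math.ceil(k * n) - k * n for k in range(cycles)]

delays = sampling_delays(Fraction(7, 4), 5)   # N = 1.75
print([float(d) for d in delays])             # [0.0, 0.25, 0.5, 0.75, 0.0]
```

Exact fractions are used here so that the wrap-around in the fifth cycle is not disturbed by floating-point rounding.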



Fig. 3. Digital computation flow of the DTC modulator for N = 1.75.

# C. Digital Implementation of the DTC Modulator for the Fractional-N Subsampling PLL

The delay that needs to be inserted into the reference path can be calculated precisely based only on the multiplication factor N and the reference period  $T_{\rm ref}$ . The digital computation of the necessary phase adjustment is depicted in Fig. 3. First we calculate the difference between the target multiplication factor and an integer-quantized value. This gives us the quantization error that the phase detection is going to make in the following sampling event, expressed in VCO periods. A  $\Delta\Sigma$  modulator is used to generate the integer quantization of N. This way the quantization error (Diff in Fig. 3) is a zero-mean stream that is easy to accumulate, and we obtain the desired "phase wrapping" behavior without any further circuitry. As a second operation we accumulate the quantization error, just as the PLL accumulates the phase difference between the VCO and the reference. At this point, we are able to tell with absolute precision what the necessary delay will be in any following cycle. Observing Fig. 3, we see that the accumulated error is reset every time the  $\Delta\Sigma$  modulator overflows towards the neighboring integer.
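The two-step computation (integer quantization followed by accumulation of the quantization error) can be sketched as follows. This is an illustration under stated assumptions rather than the exact logic of Fig. 3: the first-order modulator is modeled as a carry-out accumulator, the sign convention of `diff` is chosen so that the result is a delay, and the explicit modulo-1 wrap is an assumption of this sketch. The name `dtc_modulator` is hypothetical.

```python
from fractions import Fraction
import math

def dtc_modulator(n, cycles):
    """Return (integer division values, delays in VCO periods) per reference cycle."""
    floor_n = math.floor(n)
    frac = n - floor_n
    sd_state = Fraction(0)    # first-order delta-sigma accumulator
    phase_acc = Fraction(0)   # accumulated quantization error
    n_ints, delays = [], []
    for _ in range(cycles):
        sd_state += frac
        carry = 1 if sd_state >= 1 else 0    # modulator overflow
        sd_state -= carry
        n_int = floor_n + carry              # integer-quantized N
        diff = n_int - n                     # zero-mean quantization error
        phase_acc += diff
        n_ints.append(n_int)
        delays.append(phase_acc % 1)         # wrap into one VCO period
    return n_ints, delays

n_ints, delays = dtc_modulator(Fraction(7, 4), 4)
print(n_ints)                        # [1, 2, 2, 2]
print([float(d) for d in delays])    # [0.25, 0.5, 0.75, 0.0]
```

For N = 1.75 the sketch yields the same delay sequence as the timing example of Section II-B (starting from its second sampling event), and the accumulated error returns to zero once the pattern completes.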

# III. IMPLEMENTATION LIMITATIONS AND THEIR MITIGATION

If a fractional-N subsampling PLL, as described in the previous section, were implemented with an ideal DTC, it would have the same performance as an integer-N subsampling PLL. This lies in stark contrast to the case of a traditional mixed-signal PLL, where there is an unavoidable penalty associated with the  $\Delta\Sigma$  modulation. Any practical implementation of the fractional-N subsampling PLL system will, however, be limited in a number of ways. The biggest contributor to these limitations is the DTC. We will deal with implementation issues of the DTC one by one in the following sections, proposing adequate solutions.

1) DTC Quantization: A DTC, as any data converter, has a finite resolution. To scale the output of the accumulated phase error to a digital tuning code, we need to multiply the output of the accumulator in Fig. 4 by a factor  $T_{ref}/(LSB_{DTC} \cdot N_{frac})$ . Even in a noiseless system, the sampling moments will occur with accuracy limited by the LSB of the DTC and the resulting error current will be fed into the LPF, thereby modulating the VCO and creating spurs.
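The scaling step can be illustrated with a short calculation. This is a sketch only: the 25 ns reference period is an assumption inferred from N = 250 at 10 GHz, the fractional value 250.25 is hypothetical, and the function name is not from the paper.

```python
def delay_to_dtc_code(acc_vco_periods, t_ref_s, n_frac, lsb_s):
    """Scale an accumulated phase error (in VCO periods) to a DTC code.

    Returns the integer code and the residual timing error in seconds
    caused by the finite DTC resolution."""
    scale = t_ref_s / (lsb_s * n_frac)   # DTC LSBs per VCO period
    ideal = acc_vco_periods * scale
    code = round(ideal)
    residual_s = (ideal - code) * lsb_s
    return code, residual_s

T_REF = 25e-9      # s, assumed 40 MHz reference
N_FRAC = 250.25    # hypothetical fractional multiplication factor
LSB = 0.55e-12     # s, DTC step quoted in the abstract

code, residual = delay_to_dtc_code(0.5, T_REF, N_FRAC, LSB)
print(code, residual)
```

The residual is bounded by half an LSB; it is this leftover timing error that is converted into error current and must either be kept small or be shaped out of band.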

One solution to the problem of limited DTC resolution is obvious: we should make sure that the quantization noise resulting from the DTC step is well below other noise sources. Additionally, we recognize that adjusting the computed digital word to LSB steps is a standard modulation problem, where  $\Delta\Sigma$  modulators are often used. As such, the purpose of the second  $\Delta\Sigma$  modulator (Fig. 4) in this context is to shape the quantization noise beyond the PLL bandwidth. Thanks to the fact that the  $\Delta\Sigma$  stream is perfectly accurate on average, the average PLL frequency is also accurate, with no visible modulation. Here, we propose to use an all-pass  $\Delta\Sigma$  modulator [6], which shapes the quantization noise without affecting the DTC modulation signal.

Fig. 4. Digital DTC modulator including quantization, gain correction, and quantization noise shaping.
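The idea of shaping the DTC quantization error can be sketched with a generic first-order error-feedback quantizer. This is not necessarily the all-pass modulator of [6]; it is a textbook structure chosen here for illustration, in which the signal passes unchanged and the rounding error is first-order high-pass shaped.

```python
from fractions import Fraction
import math

def shape_quantization(codes):
    """First-order error-feedback quantizer: y[k] = x[k] + e[k-1] - e[k].

    The previous rounding error is added back before rounding, so the
    errors cancel on average instead of accumulating."""
    error = Fraction(0)
    out = []
    for x in codes:
        v = x + error
        y = math.floor(v + Fraction(1, 2))   # round to nearest integer code
        error = v - y
        out.append(y)
    return out

ideal = [Fraction(33, 10)] * 10              # a constant, non-integer DTC word
print(shape_quantization(ideal))             # integer codes averaging 3.3
```

Because the errors telescope, the sum of the integer codes never deviates from the sum of the ideal words by more than half an LSB, which corresponds to the statement that the stream is accurate on average.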

Another modification to the basic system that helps to mitigate the problem of limited DTC resolution is to use a MASH modulator [6] at the beginning of the computation path (Fig. 4). A MASH modulator provides better randomization of the generated code, which helps with reducing spurious content. Compared to a first-order  $\Delta\Sigma$ , the generated codes have a larger range, which results in a larger delay range of the DTC.<sup>2</sup> Looking at the randomization in the time domain, we realize that by generating delays larger than one  $T_{vco}$ , we effectively de-color the sampling data. Moreover, randomizing DTC codes provides an effect similar to dynamic element matching. Since, e.g., four DTC codes are used in MASH 1-1-1 mode to generate the same effective sampling phase, it is their average timing that is effective, and the apparent DTC nonlinearity is reduced.
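A MASH 1-1-1 modulator can be sketched with three cascaded accumulators and a digital noise-cancellation network. This is the generic textbook structure, assumed here for illustration; the word length (16 bits) and the input word are hypothetical.

```python
def mash111(frac_word, modulus, cycles):
    """Third-order MASH 1-1-1: each stage accumulates the residue of the
    previous one; carries are combined with (1 - z^-1) and (1 - z^-1)^2."""
    a1 = a2 = a3 = 0
    c2_prev = c3_prev = c3_prev2 = 0
    out = []
    for _ in range(cycles):
        c1, a1 = divmod(a1 + frac_word, modulus)
        c2, a2 = divmod(a2 + a1, modulus)
        c3, a3 = divmod(a3 + a2, modulus)
        out.append(c1 + (c2 - c2_prev) + (c3 - 2 * c3_prev + c3_prev2))
        c2_prev, c3_prev2, c3_prev = c2, c3_prev, c3
    return out

MODULUS = 2 ** 16
FRAC_WORD = 17 * MODULUS // 64          # fractional part 17/64
y = mash111(FRAC_WORD, MODULUS, 4096)
print(min(y), max(y), sum(y) / len(y))
```

The output values stay within -3 to 4, i.e. a range of 7, while their average converges to the programmed fraction. The wider code range is what spreads the DTC codes over several VCO periods.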

2) DTC Offset and Gain Error: If the DTC is placed in the path of the reference, any fixed delay (offset) it introduces will propagate towards the output of the PLL. However, this offset is rarely an issue and can be made small by proper design of the DTC.

DTC gain can be defined as the amount of delay per LSB code. Because the DTC is analog in nature and susceptible to PVT variations, the absolute gain will be unknown and varying with time and temperature. Gain error in the delay steps will introduce spurs in the spectrum of the PLL. It is critical that we enable automatic background calibration, which will track the gain variations and compensate in either analog or digital domain.

An automatic DTC gain calibration can be designed similarly to the popular least-mean-square (LMS) based mechanisms used in digital PLLs [1], [7] (Fig. 5). Simply stated, we have to extract the sign of the sampled voltage and correlate it with a change in direction of the DTC word. An intuitive explanation of the process can be given by considering that if the modulator "tells" the DTC to sample later, but due to gain error we consecutively detect "early" samples, we can deduce that the DTC gain is too low. After accumulation, the correction word can be applied as a scaling factor to the computation path of Fig. 4. After the correction loop converges, there is no penalty on phase noise. Fig. 6 shows a simulation result where a 10% gain error was applied to the DTC. This error introduces a large ripple in the sampled voltage, which in turn results in large spurs at the output of the PLL. After the DTC gain is corrected, the sampled voltage converges back to zero.

<sup>2</sup>A first-order  $\Delta\Sigma$  generates modulation of only 1. A popular MASH 1-1-1 modulator has an output range of 7, which is reduced after some filtering in the phase accumulator.

Fig. 5. Background calibration method for correcting DTC gain error.

Fig. 6. Simulation results of gain correction mechanism when a 10% error is applied to the DTC.
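The correlation principle can be illustrated with a strongly simplified behavioral model. All modeling choices here are assumptions of this sketch and not the circuit of Fig. 5: the loop is assumed to absorb the mean delay (taken as 0.5 VCO periods), the residual timing error is taken as proportional to the gain error times the mean-free delay, and a sign-sign LMS update with a hypothetical step size `mu` is used.

```python
def sign(x):
    return (x > 0) - (x < 0)

def calibrate_gain(k_dtc, frac, mu=1e-4, cycles=5000):
    """Sign-sign LMS estimate of the digital scaling factor k_est that
    makes k_est * k_dtc equal to 1 (k_dtc is the unknown relative DTC gain)."""
    k_est = 1.0
    d_prev = 0.0
    for k in range(1, cycles + 1):
        d = (-k * frac) % 1.0                        # ideal delay, VCO periods
        # Residual seen by the sampler: negative ("early") when the gain is
        # too low and the DTC was asked for an above-average delay.
        residual = (k_est * k_dtc - 1.0) * (d - 0.5)
        # Correlate the sign of the residual with the direction of the code step.
        k_est -= mu * sign(residual) * sign(d - d_prev)
        d_prev = d
    return k_est

k_est = calibrate_gain(k_dtc=0.9, frac=17 / 64)      # 10% gain error
print(k_est * 0.9)                                    # settles close to 1
```

Only sign information is used, which matches the observation that a simple comparator suffices to extract the error signal. The final estimate dithers around the correct value with an amplitude set by the step size.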

3) DTC Nonlinearity: As is the case in any data converter, the DTC will suffer from nonlinearity. This nonlinearity will naturally increase spurious content at the output of the PLL. Many techniques for improving linearity that are known from DACs also apply to the DTC. For example, careful layout of the tuning element is of highest priority. Advanced nanometer-scale technologies offer a significant advantage in this regard, thanks to ever-improving lithography resolution. Matching improves with technology for the same area of a capacitor array. Dynamic element matching (DEM) can also be used to improve the linearity of the array [8]. In addition, it is worth noticing that the third-order MASH modulator generating codes for the DTC effectively introduces averaging of the DTC nonlinearity, since there are multiple codes spaced all along the range of the DTC that can be used to sample the same phase offset.

Fig. 7. Architecture of the fractional-N subsampling PLL.

4) DTC Phase Noise: In this paper we propose a solution to enhance an integer-N subsampling PLL by placing a phase modulator (DTC) in the path of the reference. Unfortunately, the phase noise contribution of the DTC adds directly to the phase noise of the reference. Ultimately, the in-band phase noise of the subsampling PLL is limited by the phase noise of both the reference and the DTC, since both pass the system in the same way. Therefore, great care must be taken to minimize the DTC's contribution to phase noise, otherwise the unique phase noise advantages of the subsampling architecture will be lost. Here, scaling of CMOS technology is again on our side, since transistors are getting faster with every node, reducing jitter and phase noise.

# IV. CIRCUIT IMPLEMENTATION

The subsampling phase locked loop can only detect phase error, which makes it susceptible to false locking at any N. Therefore, a frequency acquisition loop is required in addition to the subsampling loop [2] (see Fig. 7). A simple conventional PLL easily fulfills this requirement. It can be disabled once frequency has been acquired in order to save power.

Common to both frequency and phase acquisition loops are the low-pass filter (LPF) and the VCO. For the purpose of demonstrating the concept of the fractional-N subsampling PLL we have chosen the simplest LPF design: a passive third-order lead-lag filter. Tunable resistance in the LPF has been implemented to be able to change the bandwidth of the PLL. Such a simple filter can cause an increase in reference spurs and is often avoided in classical charge-pump-based PLLs. Spurious content can increase because the varying level of the tuning voltage can introduce mismatches between the currents of the charge pump. In this design, however, any offset in currents of the  $G_M$  is compensated by a slight modification of the locking point (Fig. 8). A locked condition always means zero output current of the  $G_M$ . If changes to the output level cause an input-referred offset of the  $G_M$ , the PLL will adapt its locking phase to compensate for this offset.



Fig. 8. The subsampling PLL always locks into a state that guarantees zero output current, even in presence of offset and mismatch.



Fig. 9. Simplified schematic of the subsampling loop.

# A. Implementation of the Subsampling Loop

The subsampling loop consists of a VCO buffer, a sampler and a  $G_M$ . Additionally, the DTC provides the required phase modulation. Fig. 9 shows the circuits along the subsampling path.

A VCO buffer is required in order to reduce the kickback effect from the sampler to the VCO [9] and to interface the signal levels between the blocks. In this test chip, to accommodate for changing phase noise requirements of a software-defined radio, we have implemented a low-noise VCO that can be operated from a variable supply as high as 1.8 V. Therefore, the input buffer needs to convert the level between the high voltage VCO domain (max. 1.8 V) and the core domain (0.9 V). Additionally, the signal processed by the buffer needs to remain roughly sinusoidal in shape, so that the detection gain (and hence loop gain) can be controlled. The buffer is implemented with a tunable capacitive attenuator and a source follower pair (see Fig. 9). The tunable attenuator is built with metal-oxide-metal (MOM) capacitors and provides additional tuning of loop gain. The buffer is also the largest contributor to power consumption in this loop, as it needs to process a GHz-range signal.

The sampler is built around an NMOS switch and a small MOM capacitor. In total, taking into account the input capacitance of the  $G_M$ , the sampling capacitance is 20 fF. Thermal kT/C noise can be neglected because it is already suppressed by the large detection gain. The implemented sampler uses an auxiliary sampler operating in inverted phase to the primary sampler in order to reduce load variability of the VCO.



Fig. 10. Schematic of the transconductor. The input pair is driven by differential sampled voltage. The output current  $(i_{out})$  is duty-cycled and flows to the loop filter.

Since the implemented VCO can operate from the IO voltage, the tuning voltage also has a range larger than the core voltage. Therefore, the output stage of the  $G_M$  needs to provide translation from the low voltage domain of the sampler to the high voltage domain of the LPF and the VCO. This translation is done in the current domain between the first and the second stage of the  $G_M$  (Fig. 10). The first stage is a simple differential pair providing the necessary transconductance, whereas the second stage implements a charge pump-like output. Identically to [2], the detection gain is so large that duty-cycling is required in the output stage of the  $G_M$ . Pulsing is done with a simple digital pulse generator that opens the output switches of the  $G_M$ . Importantly, variations in the pulse width merely change the loop gain. A solution to varying loop gain and loop bandwidth would be to implement loop bandwidth tracking. The solution of [10] could be attractive here, since it uses the same error signal as the DTC gain compensation.

An important part of the system is the background correction of DTC gain. As said earlier, the error signal from within the PLL is present in the sign of the sampled voltage. However, this is true only if no mismatches are present in the system. If there are any mismatches in the phase detection circuitry (VCO buffers, samplers,  $G_M$ ), the PLL will adjust the locking phase (and sampled voltage) so that the output current of the  $G_M$  is zeroed (Fig. 8). Therefore, the gain correction mechanism requires detection of the sign of output current. This is why the output stage of the  $G_M$  is realized using cascodes. The slightest imbalance of current in the output branch results in a large swing of voltage at the output node. Using a simple clocked comparator to detect the sign of this swing in relation to vtune voltage is sufficient to obtain information about the sign of the output current.



Fig. 11. Complete architecture of the DTC.

# B. Implementation of the Digital-to-Time Converter

Since the DTC is at the input of the system, its phase noise is multiplied by the square of the PLL multiplication number when transferred to the output [2] (here: 48 dB as N = 250). On top of that, any kind of nonlinearity present in the phase error comparison path leads to potential noise folding or spurs [11].
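The 48 dB figure follows directly from the multiplication factor. The short sketch below reproduces it and, as an illustration only, refers the -160 dBc/Hz delay-stage noise target quoted later in this subsection to the PLL output; other in-band contributors are ignored.

```python
import math

def reference_path_noise_gain_db(n):
    """In-band gain from reference-path phase noise to the PLL output."""
    return 20 * math.log10(n)

gain_db = reference_path_noise_gain_db(250)
print(round(gain_db, 1))              # 48.0

# Illustration: a -160 dBc/Hz contribution at the reference input, seen at the output.
print(round(-160 + gain_db, 1))       # -112.0
```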

From the PLL system perspective, we target a 10-bit DTC, with a 0.5 ps unit step. This delay range covers multiple VCO periods allowing operation with the third-order MASH 1-1-1 modulator. Because the 0.5 ps step is very small and we know from system simulations that the PLL is sensitive to its disturbance, we can suspect that the DTC needs a good isolation from any noise coming from the supply.
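The claim that the targeted range covers multiple VCO periods can be checked with a one-line calculation, sketched here for the band edges and for 10 GHz.

```python
def dtc_range_in_vco_periods(bits, lsb_s, f_vco_hz):
    """Full-scale DTC delay expressed in VCO periods."""
    full_scale_s = (2 ** bits - 1) * lsb_s
    return full_scale_s * f_vco_hz

for f_vco in (9.2e9, 10e9, 12.7e9):
    periods = dtc_range_in_vco_periods(10, 0.5e-12, f_vco)
    print(f"{f_vco / 1e9:5.1f} GHz: {periods:.2f} VCO periods")
```

With a 10-bit word and a 0.5 ps step the full scale is about 511.5 ps, roughly five VCO periods at 10 GHz.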

Implementation of the delay generator is shown in Fig. 11 [12]. The first inverters in the chain serve as an input buffer towards the delay circuit loaded with a tunable MOM capacitance  $C_L$ . To suppress mismatch-based errors for the chosen unit size, the capacitor array employs a 5 bit binary/thermometer segmentation. Only the high-to-low transition of the  $V_x$  voltage is important, because the subsampling loop reacts only on closing of the sampling switch. One could realize discharging of the load capacitance using a simple NMOS transistor. However, this would lead to an excessive 1/f noise contribution, which would dominate the PLL phase noise. To reduce this effect we introduce a resistor above the NMOS. The exponential discharging is then determined by the corresponding RC time constant. The delay is, however, a linear function of capacitance. The resistor sets the discharging slope and hence contributes to the output phase noise, but it generates no 1/f noise. The NMOS transistor can have minimal length and operate merely as a switch, introducing no 1/f noise. Furthermore, any supply ripple coming from the preceding buffer only modulates the NMOS switch resistance, which is an order of magnitude smaller than the discharging resistor, and does not affect delay. The phase noise level introduced by the delay generator can be derived [12] as

$$\mathcal{L}_{white} \propto 10 \log \left( f_{out} \frac{kT C_L R^2}{V_{DD}^2} \right) \propto 10 \log \left( f_{out} \frac{kT \tau_{delay} R}{V_{DD}^2} \right) \tag{1}$$

where  $R$  is the resistor value,  $V_{DD}$  is the supply voltage of the delay element and  $f_{out}$  is the signal frequency. The second form follows when the delay  $\tau_{delay}$  is taken as proportional to  $R C_L$ . Based on (1) and the targeted minimal delay step of 0.5 ps we choose  $R = 180 \ \Omega$  and a unit capacitance  $C = 3 \ \text{fF}$  to lower the noise of this stage to -160 dBc/Hz for maximal delay. The RC delay control block is followed by a CMOS inverter serving as a comparator to restore steep slopes. The toggling moment of this circuit is unfortunately dependent on the input slope shape [13], which degrades the linearity of a high-range DTC. Care must also be given to the fact that regeneration of the RC-delayed slope is most vulnerable to supply modulation. A tunable regulated supply shown in Fig. 11 is used to protect the supply of the comparator and the following buffer. The regulated supply consists of a constant current source biasing a diode-connected transistor. A capacitor of 4 pF is used for additional decoupling of the regulated supply node. At the moment of toggling, charge is instantaneously pulled from the capacitor, and not from the VDD. The dip in the regulated supply voltage is suppressed by the gain of the current source before reaching the top supply. The dynamic charge flow is in this way kept within the structure itself.

Fig. 12. Class-B VCO schematic and layout floorplan of the NMOS-only digital varactor unit cell of Fig. 13(a).

# C. Implementation of the VCO

The VCO used in this PLL is the class-B structure of Fig. 12, similar to [14]. A digital varactor utilizing ultra-low  $V_T$  thin-oxide transistors provides 6-bit digital coarse frequency tuning, and an analog thick-gate-oxide varactor provides fine tuning. The cross-coupled  $-g_m$  transistors, MC, see the full VCO swing and are implemented as thick-oxide devices. A digitally tunable tail resistor can be used to trade power consumption for phase noise performance.

The digital varactor unit cell used in this VCO is illustrated in Fig. 13(a). Compared to the widely used conventional cell of [15], shown in Fig. 13(b), the proposed structure has a number of advantages, particularly in the context of nanoscale CMOS technology [16]. In the on-state (EN = 1), the proposed switched capacitor cell operates very similarly to the conventional cell: the switch  $M_{SW}$  differentially shorts nodes  $V_A$  and  $V_B$  together and the linear (MOM) capacitors  $C_U$  add to the overall tank capacitance of the VCO. However, in the off-state, the  $M_{PIN}$  transistors provide a 'bottom-pinning' functionality, setting the DC bias levels of the cell such that voltage stress on the devices can be minimized [16]. Additionally, this structure naturally produces the highest off-state Q possible, since all leakage is dynamically compensated by the pinning transistors. The proposed cell also benefits from its compact, NMOS-only implementation. As seen in the simplified cell layout of Fig. 12, it can be realized as a single composite NMOS block placed between the two unit capacitors. By comparison, the conventional cell uses both PMOS and NMOS transistors and a polysilicon resistor, which cannot be abutted. In this design there are 15 thermometrically switched capacitor cells, together with a half and a quarter cell.

Fig. 13. Proposed and conventional switched capacitor structures. The proposed cell (a) is used to implement the digital varactor of the VCO. (a) Proposed switched capacitor cell; (b) Conventional cell of [15].

Fig. 14. Architecture of the frequency acquisition loop.
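The combination of 15 full cells, a half cell and a quarter cell gives 64 uniformly spaced capacitance levels, consistent with 6-bit coarse tuning. The sketch below makes the count explicit; the bit assignment (two LSBs on the fractional cells, four MSBs thermometer-decoded) is an assumption of this illustration and is not stated in the text.

```python
def varactor_units(code):
    """Switched capacitance, in unit cells, for a 6-bit coarse-tuning word.

    Assumed mapping: bit 0 -> quarter cell, bit 1 -> half cell,
    bits 2..5 -> number of thermometer-switched full cells (0..15)."""
    if not 0 <= code < 64:
        raise ValueError("code must fit in 6 bits")
    quarter = code & 1
    half = (code >> 1) & 1
    full_cells = code >> 2
    return full_cells + 0.5 * half + 0.25 * quarter

levels = [varactor_units(c) for c in range(64)]
print(levels[0], levels[1], levels[-1])   # 0.0 0.25 15.75
```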

# D. Implementation of the Frequency Acquisition Loop

The frequency acquisition loop (Fig. 14) can be as simple and as low power as possible. Here it has been implemented with a chain of divide-by-2/3 circuits [17], a traditional 3-state PFD, enhanced with a large deadzone following [2] and a very simple charge pump. The first stage of the divider is made with CML logic [18], since the VCO frequency can reach >12 GHz, but the following stages of the divider are standard CMOS gates. The divider itself is driven by the same MASH  $\Delta\Sigma$  modulator used in the digital block of the subsampling path (see Fig. 3). The charge pump does not require any mismatch correction, since during frequency acquisition we do not care about phase noise performance. Once the frequency acquisition is complete, the loop automatically becomes inactive thanks to the increased dead-zone in the PFD and can be completely shut down, saving power.
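For a chain of divide-by-2/3 cells, a commonly used programming rule is that a chain of n cells with programming bits p<sub>0</sub> ... p<sub>n-1</sub> divides by 2<sup>n</sup> + Σ p<sub>i</sub> 2<sup>i</sup>. The sketch below applies this rule; the number of stages (7) is an assumption chosen so that N = 250 falls within range and is not stated in the text.

```python
def division_ratio(p_bits):
    """Division ratio of a chain of 2/3 cells; p_bits[0] programs the first cell."""
    n = len(p_bits)
    return 2 ** n + sum(p << i for i, p in enumerate(p_bits))

def program_bits(ratio, stages):
    """Programming bits that realize `ratio` with `stages` cascaded 2/3 cells."""
    if not 2 ** stages <= ratio < 2 ** (stages + 1):
        raise ValueError("ratio outside the range of this chain")
    offset = ratio - 2 ** stages
    return [(offset >> i) & 1 for i in range(stages)]

bits = program_bits(250, 7)
print(bits, division_ratio(bits))
```

When the divider is driven by the MASH output, the programmed ratio changes from cycle to cycle around the integer part of N, so the chain must cover the whole excursion of the modulator.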

In general, the loop components for both the phase and the frequency acquisition loop can be made very simple and require neither good precision, nor good matching, nor low noise. This makes the system suitable for deeply scaled technologies where analog performance is low and also for very high frequency applications, where accuracy may be a problem.

# V. EXPERIMENTAL RESULTS

Fig. 15. Chip microphotograph.

Fig. 16. Measured VCO tuning range. Analog tuning (0-1.8 V) is used between digital words.

Fig. 17. Measured VCO free-running phase noise for low-power (VDD = 0.9 V) and high-power (VDD = 1.4 V) mode.

A prototype chip of the fractional-N subsampling PLL was manufactured in a 1P9M 28 nm bulk digital CMOS technology and occupies an area of 1 mm<sup>2</sup> (Fig. 15). The active area of the PLL is naturally smaller and is dominated by the low-noise VCO, which occupies 500 $\mu$m × 250 $\mu$m. The chip is powered by 0.9 V and 1.8 V supplies. The 1.8 V supply is used for the IO interface, the charge pump and the G<sub>M</sub> stage. The VCO is designed to work with a low-dropout regulator (LDO) operating at 1.8 V. This LDO, however, is not present on chip, and for the results shown below the VCO runs from an unregulated 0.9 V supply. Power consumption (excluding the 50 $\Omega$ output drivers and with the PFD-based loop powered down) is 13 mW, of which the DTC and G<sub>M</sub> stage consume 0.5 mW and 0.6 mW, respectively, the VCO 8 mW, the source-follower VCO buffer 1 mW and the digital controller 2.5 mW. The digital controller was optimized neither for power nor for area and includes additional testing circuitry that cannot be clock-gated.<sup>3</sup> The VCO frequency tuning spans from 9.2 GHz to 12.7 GHz [16] with a sensitivity to the analog tuning voltage that reaches 200 MHz/V around 10 GHz (Fig. 16). The out-of-band phase noise can be improved by 2 dB at the cost of an additional 10 mW [16] if the VCO runs at 1.4 V (Fig. 17).
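As a quick consistency check on the figures quoted above (a back-of-the-envelope sketch, not an additional measurement), the per-block power numbers can be summed and the relative tuning range evaluated. Taking the tuning range relative to the center frequency is an assumption about how the 32% in the abstract was obtained.

$$P_{\text{total}} \approx \underbrace{0.5}_{\text{DTC}} + \underbrace{0.6}_{G_M} + \underbrace{8}_{\text{VCO}} + \underbrace{1}_{\text{buffer}} + \underbrace{2.5}_{\text{digital}} = 12.6\ \text{mW} \approx 13\ \text{mW}$$

$$\frac{f_{\max} - f_{\min}}{(f_{\max} + f_{\min})/2} = \frac{12.7 - 9.2}{10.95} \approx 32\%$$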



Fig. 18. Measured INL and DNL characteristics of the DTC.

Oscilloscope measurements of the DTC show INL and DNL of less than 1.5 LSB and 0.8 LSB, respectively (Fig. 18). The nominal time resolution is 550 fs, which was confirmed via the output of the DTC gain estimation algorithm.
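For readers who want to reproduce this kind of characterization, the sketch below shows one common way to derive DNL and INL from a delay-versus-code sweep. It assumes an endpoint fit (the LSB is taken as the average step between the first and last codes); the source does not state which fit was used. The function name `dnl_inl_lsb` and the eight-point sweep are hypothetical and serve only as illustration.

```python
def dnl_inl_lsb(delays_ps):
    """Endpoint-fit DNL and INL, in LSB, from a delay-versus-code sweep."""
    n_codes = len(delays_ps)
    # The average step between the first and last code serves as the LSB estimate.
    lsb_ps = (delays_ps[-1] - delays_ps[0]) / (n_codes - 1)
    # DNL: deviation of each code-to-code step from one LSB.
    dnl = [(delays_ps[k + 1] - delays_ps[k]) / lsb_ps - 1.0
           for k in range(n_codes - 1)]
    # INL: deviation of each code's delay from the endpoint straight line.
    inl = [(delays_ps[k] - delays_ps[0]) / lsb_ps - k
           for k in range(n_codes)]
    return lsb_ps, dnl, inl


# Hypothetical 8-code sweep (ps): 0.55 ps nominal step with made-up deviations.
sweep_ps = [0.00, 0.55, 1.20, 1.65, 2.15, 2.80, 3.30, 3.85]
lsb_ps, dnl, inl = dnl_inl_lsb(sweep_ps)
print(f"LSB ~ {lsb_ps:.3f} ps, "
      f"max|DNL| = {max(map(abs, dnl)):.2f} LSB, "
      f"max|INL| = {max(map(abs, inl)):.2f} LSB")
```

A best-fit line instead of the endpoint line would generally give somewhat smaller INL values, so the choice of fit should be stated alongside any reported number.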

#### A. Measured Phase Noise Performance

Phase noise was measured using an Agilent E5052B signal analyzer with an external 7 GHz downconverter. A sample phase noise result around a carrier frequency of 10 GHz, showing the fractional-N spectrum with the worst-case spur (880 kHz), is shown in Fig. 19. For comparison, the integer-N phase noise is visible as a memory trace, showing little degradation in the fractional-N mode. The in-band (200 kHz) phase noise reaches -104 dBc/Hz in the fractional-N mode. If the bandwidth of the PLL is extended beyond the optimum for RMS jitter, the in-band phase noise level can drop to -108 dBc/Hz. We believe that the noise at low offset frequencies is 1/f noise of the reference chain and the DTC. In addition, the regulated supply of the DTC adds some filtered 1/f noise.

The integrated phase noise in fractional-N mode ranges between -40 dBc and -38 dBc, depending on the fractional number. In integer-N mode it reaches -41 dBc. Phase noise integration was done from 10 kHz to 60 MHz and includes all spurs. No compensation or correction was applied to the system, apart from the online DTC gain correction. The PLL operates in a MASH 1-1-1 mode. Jitter was extracted from the integrated phase noise and is shown versus fractional codes in Fig. 20. With the fractional spur out-of-band, the RMS jitter reaches 230 fs. When working in integer-N mode, the synthesizer achieves an RMS jitter of only 204 fs. Integer-N RMS jitter is also reported against the VCO tuning range in Fig. 20.

Settling time (with a fractional step of 20 MHz) is below 2 $\mu$s. Locking time from a free-running VCO (with preselected band) is below 12 $\mu$s. No automatic band selection mechanism is present on chip. Spurious response was measured using a Rohde & Schwarz FSQ26 spectrum analyzer and is shown in Fig. 21. The worst-case in-band fractional spur is -43 dBc. The reference spur is -60 dBc.
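The link between the integrated phase noise and the RMS jitter figures quoted above can be sketched with the usual conversion $\sigma_t = \sqrt{2 \cdot 10^{L_{\text{int}}/10}} / (2\pi f_c)$. This assumes the quoted dBc values are single-sideband integrals (hence the factor of 2) and a carrier of exactly 10 GHz; both are assumptions, and the function name is hypothetical.

```python
import math


def rms_jitter_from_ipn(ipn_dbc, f_carrier_hz):
    """RMS jitter in seconds from single-sideband integrated phase noise in dBc."""
    # Factor 2 accounts for both sidebands (assumption: ipn_dbc is single-sideband).
    phase_variance_rad2 = 2.0 * 10.0 ** (ipn_dbc / 10.0)
    return math.sqrt(phase_variance_rad2) / (2.0 * math.pi * f_carrier_hz)


for ipn_dbc in (-41.0, -40.0, -38.0):
    jitter_fs = rms_jitter_from_ipn(ipn_dbc, 10e9) * 1e15
    print(f"{ipn_dbc:6.1f} dBc -> {jitter_fs:4.0f} fs RMS")
```

Under these assumptions the three values come out at roughly 200 fs, 225 fs and 283 fs, in line with the reported 204 fs (integer-N) and 230 fs to 280 fs (fractional-N) once the rounding of the dBc figures is taken into account.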

<sup>3</sup>For instance, the digital controller features a full lookup table (LUT) for the DTC, which is built with 10k flip-flops. The LUT was programmed with a perfectly linear mapping and was, therefore, not necessary.

Fig. 19. Measured phase noise for a worst case fractional-N scenario. For reference, the integer-N phase noise trace is shown as well.

Fig. 20. Measured RMS jitter across fractional codes (integer part of N = 250) and integer-N jitter with respect to VCO tuning range.

Fig. 22 shows the effect of enabling the DTC gain correction mechanism. Without correction, phase noise is simply not acceptable. If a 1% error is intentionally introduced to the optimal DTC gain, large spurs can be observed. Finally, optimal performance is obtained if the background calibration is tracking the DTC gain.
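To get a feel for why a 1% gain error is already visible, a first-order estimate can be made (a sketch under stated assumptions, not the authors' analysis). Suppose the digital controller requests a delay $t_{\text{DTC}}$ while the actual DTC gain deviates by a relative error $\varepsilon$; the uncancelled timing error is then roughly $\varepsilon \cdot t_{\text{DTC}}$. Taking the full-scale range of the 10 bit, 0.55 ps/LSB DTC as a pessimistic bound on $t_{\text{DTC}}$ (the range actually exercised depends on the VCO period and the modulator order):

$$\Delta t_{\text{res}} \approx \varepsilon \cdot t_{\text{DTC}} \le 0.01 \cdot \left(2^{10} \cdot 0.55\ \text{ps}\right) \approx 5.6\ \text{ps}$$

Before any loop filtering, this peak residual is more than an order of magnitude above the 280 fs RMS jitter of the locked loop. Because it is correlated with the $\Delta\Sigma$ quantization sequence, it can plausibly appear as spurs rather than as broadband noise.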

Fig. 23 shows the effect of the MASH modulator. A higher DSM order is preferable; however, it increases the required range of the DTC.
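The trade-off between modulator order and DTC range can be illustrated with a small behavioral model. The sketch below is an idealized error-feedback MASH built from cascaded accumulators, not the authors' implementation; the function name, the 16-bit word length and the example fractional word are all assumptions. It tracks the accumulated difference between the modulator output and the ideal fraction, expressed in VCO periods, which is the quantity a DTC in the reference path would have to cover.

```python
from math import comb


def mash_phase_error(frac_word, order, n_samples, bits=16):
    """Simulate an error-feedback MASH 1-1-...-1 of the given order.

    Returns (outputs, accumulated_error), the latter in VCO periods.
    """
    modulus = 1 << bits
    residue = [0] * order                             # accumulator state per stage
    carries = [[0] * (k + 1) for k in range(order)]   # carries[k][j]: stage k, j samples ago
    outputs, acc_error = [], []
    running = 0                                       # sum of (y * modulus - frac_word)
    for _ in range(n_samples):
        stage_in = frac_word
        for k in range(order):
            carry, residue[k] = divmod(residue[k] + stage_in, modulus)
            carries[k] = [carry] + carries[k][:-1]
            stage_in = residue[k]                     # next stage integrates this residue
        # Noise-cancellation network: the carry of stage k is filtered by (1 - z^-1)^k.
        y = sum((-1) ** j * comb(k, j) * carries[k][j]
                for k in range(order) for j in range(k + 1))
        running += y * modulus - frac_word
        outputs.append(y)
        acc_error.append(running / modulus)
    return outputs, acc_error


frac_word = round(0.022 * (1 << 16))  # hypothetical code for a fraction near 0.022
for order in (1, 2, 3):
    _, err = mash_phase_error(frac_word, order, 4096)
    print(f"order {order}: accumulated error spans "
          f"{max(err) - min(err):.2f} VCO periods")
```

In this idealized model the accumulated error stays within less than 1, 2 and 4 VCO periods for orders 1, 2 and 3, respectively. At 10 GHz a VCO period is 100 ps, so a 10 bit DTC with 0.55 ps/LSB (about 563 ps of range) would be expected to cover the third-order case with some margin.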

#### B. Remaining Fractional Spur

Fig. 21. Measured output spectrum of the PLL showing the worst case fractional spur and the reference spur.

Fig. 22. Measured effect of DTC gain mismatch. 1% error in gain was intentionally applied.

Fig. 23. Measured phase noise as a function of the $\Delta\Sigma$ modulator order. The higher the order, the lower the spurs.

TABLE I

Performance Summary and Comparison With Other Low-Jitter Fractional-N CMOS PLLs

|                                                    | This work      | [8]          | [20]    | [7]     | [1]     | [19]    | [2]               |
|----------------------------------------------------|----------------|--------------|---------|---------|---------|---------|-------------------|
| Type of PLL                                        | Analog         | Analog       | Analog  | Analog  | Digital | Digital | Analog, integer-N |
| Technology                                         | 28 nm          | 180 nm       | 180 nm  | 65 nm   | 65 nm   | 55 nm   | 180 nm            |
| Tuning range (GHz)                                 | 9.2–12.7       | 2.2–2.4      | 2.5–3.2 | 3.0–4.0 | 2.9–4.0 | 5.9–8.0 | 2.21              |
| Reference frequency (MHz)                          | 40             | 48           | 33      | 40      | 40      | 40      | 55.25             |
| Bandwidth (MHz)                                    | 1.8            | 0.5          | 0.2     | 0.5     | 0.3     | 0.5     | 4.5               |
| In-band phase noise (dBc/Hz) <sup>1</sup>          | -104           | -99.2        | -89     | -96.2   | -92.5   | -103    | -112              |
| Phase noise at 20 MHz offset (dBc/Hz) <sup>2</sup> | -138           | -128         | -139    | -123.2  | -128    | -144    | -128.8            |
| RMS jitter (fs) <sup>3</sup>                       | 230–280        | 266–400      | 455     | 463     | 560     | 190     | 150               |
| Integrated phase noise (dBc) <sup>1,3</sup>        | -39.8 to -38.1 | -38.5 to -35 | -34.8   | -33.8   | -32     | -41.5   | -43.7             |
| Worst fractional spur (dBc)                        | -43            | -53          | -74     | -42     | -42     | -70     |                   |
| Reference spur (dBc)                               | -60            | -55          | -78     | -71     | -72     | -94     | -46               |
| Power (mW)                                         | 13             | 17.3         | 48      | 5       | 4.5     | 36      | 7.6               |
| Figure-of-Merit (FoM)                              | -241.5 to -240 | -239.1       | -230    | -239.7  | -238.5  | -239    | -247.5            |

<sup>1</sup> Scaled to 10 GHz by $20 \log(f_c/10\ \text{GHz})$.

<sup>2</sup> Scaled to 10 GHz and extrapolated from existing data to 20 MHz offset.

<sup>3</sup> Including in-band or out-of-band spurs.

The integrated jitter is degraded when a fractional spur appears in-band. The offset frequency of this spur is directly proportional to the fractional part of the multiplication factor: $f_{\text{spur offset}} = f_{\text{ref}} \cdot (N - \lfloor N \rfloor)$. The spur is believed to be caused by the non-regulated supply of the VCO. The VCO is designed to be operated together with an LDO, which is not present on this chip. The sensitivity of the VCO to its supply is in fact higher than its sensitivity to the tuning voltage. In the high-power mode of the oscillator, the supply sensitivity is reduced and the fractional spur drops by approximately 10 dB. As the spur is visible in both the subsampling mode and the classical mode of the PLL, it is reasonable to believe that the spur does not come from the subsampling operation, but is rather an effect of parasitic coupling.
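As a worked instance of the spur-offset relation in this subsection, the measured values can be plugged in. The specific multiplication factor below is inferred from those values and is not stated by the source. With $f_{\text{ref}} = 40$ MHz (Table I), an integer part of 250 (Fig. 20) and the worst-case spur at an offset of 880 kHz (Fig. 19):

$$N - \lfloor N \rfloor = \frac{f_{\text{spur offset}}}{f_{\text{ref}}} = \frac{880\ \text{kHz}}{40\ \text{MHz}} = 0.022, \qquad f_{\text{out}} = N \cdot f_{\text{ref}} \approx 250.022 \cdot 40\ \text{MHz} = 10.00088\ \text{GHz}$$

Since this offset lies inside the 1.8 MHz loop bandwidth, the loop would not be expected to attenuate the spur, which is consistent with it being reported as the worst case.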

#### C. Performance Summary and Comparison to the State-of-the-Art

A generally applied figure-of-merit (FoM) for PLL synthesizers is defined as

$$\text{FoM} = 10 \cdot \log \left[ \left( \frac{\text{RMS jitter}}{1\ \text{s}} \right)^2 \cdot \left( \frac{\text{Power}}{1\ \text{mW}} \right) \right]. \qquad (2)$$

Table I and Fig. 24 show the summary of performance and the FoM comparison for a few recent low-jitter fractional-N synthesizers. The figure-of-merit of the presented fractional-N subsampling PLL reaches -241.5 with the worst-case spur out-of-band, or -240 with the spur in-band. The excellent FoM is achieved thanks to the very low phase noise, but also thanks to the simplicity of the subsampling loop, which can be designed for low power. Compared to [8], which is also a DTC-enhanced subsampling PLL, the in-band phase noise (after scaling to 10 GHz) is close to 6 dB lower, which may be a benefit of working with a 28 nm technology. On the other hand, nanometer-scale technologies suffer from large 1/f noise, which, in our case, dominates the noise profile of the DTC. Our FoM is only slightly better than that of [7], though it is achieved for an almost three times larger N and with a three times larger bandwidth. The design is on par with the lowest-jitter digital PLL in [19], while consuming only a third of its power and not using a reference doubler.
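The FoM values quoted for this work can be checked directly against Eq. (2). The short sketch below uses the rounded jitter and power figures from Table I; the function name is hypothetical.

```python
import math


def pll_fom_db(rms_jitter_s, power_mw):
    """Jitter-power figure-of-merit of Eq. (2), in dB (more negative is better)."""
    return 10.0 * math.log10((rms_jitter_s / 1.0) ** 2 * (power_mw / 1.0))


for jitter_fs in (230.0, 280.0):
    fom = pll_fom_db(jitter_fs * 1e-15, 13.0)
    print(f"{jitter_fs:.0f} fs at 13 mW -> FoM = {fom:.1f} dB")
```

With these rounded inputs the result is about -241.6 dB and -239.9 dB, within roughly 0.1 dB of the reported -241.5 and -240; the small difference is presumably due to rounding of the published jitter and power numbers.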

Fig. 24. Figure-of-merit comparison of recent fractional-N synthesizers.

As a final remark, due to an underestimated loop gain, the bandwidth cannot be made smaller than 1.8 MHz, which worsens the final integrated phase noise and jitter. A simplified analysis of the optimum bandwidth, based on the in-band phase noise level and the VCO phase noise, shows a potential for a 5 dB improvement in integrated phase noise.

# VI. CONCLUSION

We propose a methodology for enhancing the low phase noise subsampling PLL to work with fractional-N multiplication factors. This methodology introduces a digital-to-time converter in the path of the reference clock, assisted by a simple digital controller. Open-loop modulation of the DTC is possible thanks to the fact that the quantization error introduced by the integer-N PLL is known *a priori*. We propose an effective online calibration mechanism to adjust the modulation to the PVT variations of the DTC. Moreover, we propose a number of techniques to improve the spurious performance of the system, which is limited by the resolution of the DTC. A fractional-N subsampling PLL prototype reaches 280 fs of RMS jitter in the worst-case fractional spur scenario and 204 fs in integer-N mode while consuming 13 mW. The synthesizer has a tuning range from 9.2 GHz to 12.7 GHz. Compared to state-of-the-art synthesizers (see Table I), and to our knowledge, this is the lowest phase noise analog fractional-N synthesizer to date. The in-band phase noise level of -104 dBc/Hz challenges the state-of-the-art of all fractional-N synthesizers.

# REFERENCES

- [1] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. Lacaita, "A 2.9–4.0-GHz fractional-N digital PLL with bang-bang phase detector and 560-fsrms integrated jitter at 4.5-mW power," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2745–2758, Dec. 2011.
- [2] X. Gao, E. Klumperink, M. Bohsali, and B. Nauta, "A low noise subsampling PLL in which divider noise is eliminated and PD/CP noise is not multiplied by N<sup>2</sup>," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 3253–3263, Dec. 2009.
- [3] K. Raczkowski, N. Markulic, B. Hershberg, J. Van Driessche, and J. Craninckx, "A 9.2–12.7 GHz wideband fractional-N subsampling PLL in 28 nm CMOS with 280 fs RMS jitter," in *IEEE Radio Frequency Integrated Circuits Symp. Dig.*, 2014, pp. 89–92.
- [4] T. Riley, M. Copeland, and T. Kwasniewski, "Delta-sigma modulation in fractional-N frequency synthesis," *IEEE J. Solid-State Circuits*, vol. 28, no. 5, pp. 553–559, May 1993.
- [5] A. Swaminathan, K. Wang, and I. Galton, "A wide-bandwidth 2.4 GHz ISM band fractional-N PLL with adaptive phase noise cancellation," *IEEE J. Solid-State Circuits*, vol. 42, no. 12, pp. 2639–2650, Dec. 2007.
- [6] R. Schreier and G. C. Temes, *Understanding Delta-Sigma Data Converters*. New York, NY, USA: Wiley, 2004.
- [7] S. Levantino, G. Marzin, C. Samori, and A. Lacaita, "A wideband fractional-N PLL with suppressed charge-pump noise and automatic loop filter calibration," *IEEE J. Solid-State Circuits*, vol. 48, no. 10, pp. 2419–2429, Oct. 2013.
- [8] W.-S. Chang, P.-C. Huang, and T.-C. Lee, "A fractional-N divider-less phase-locked loop with a subsampling phase detector," *IEEE J. Solid-State Circuits*, vol. 49, no. 12, pp. 2964–2975, Dec. 2014.
- [9] X. Gao, E. Klumperink, G. Socci, M. Bohsali, and B. Nauta, "Spur reduction techniques for phase-locked loops exploiting a sub-sampling phase detector," *IEEE J. Solid-State Circuits*, vol. 45, no. 9, pp. 1809–1821, Sep. 2010.
- [10] G. Marzin, S. Levantino, C. Samori, and A. Lacaita, "A background calibration technique to control bandwidth in digital PLLs," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, 2014, pp. 54–55.
- [11] A. Lacaita, S. Levantino, and C. Samori, *Integrated Frequency Synthesizers for Wireless Systems*, ser. Engineering Pro. Cambridge, U.K.: Cambridge Univ. Press, 2007.
- [12] N. Markulic, K. Raczkowski, P. Wambacq, and J. Craninckx, "A 10-bit, 550-fs step digital-to-time converter in 28 nm CMOS," in *Proc. 40th Eur. Solid-State Circuits Conf. (ESSCIRC)*, 2014, pp. 79–82.
- [13] D. Auvergne, J. Daga, and M. Rezzoug, "Signal transition time effect on CMOS delay evaluation," *IEEE Trans. Circuits Syst. I: Fund. Theory Applicat.*, vol. 47, no. 9, pp. 1362–1369, Sep. 2000.
- [14] P. Andreani, K. Kozmin, P. Sandrup, M. Nilsson, and T. Mattsson, "A TX VCO for WCDMA/EDGE in 90 nm RF CMOS," *IEEE J. Solid-State Circuits*, vol. 46, no. 7, pp. 1618–1626, Jul. 2011.
- [15] H. Sjoland, "Improved switched tuning of differential CMOS VCOs," *IEEE Trans. Circuits Systems II: Analog Digital Signal Process.*, vol. 49, no. 5, pp. 352–355, May 2002.
- [16] B. Hershberg, K. Raczkowski, K. Vaesen, and J. Craninckx, "A 9.1–12.7 GHz VCO in 28 nm CMOS with a bottom-pinning bias technique for digital varactor stress reduction," in *Proc. 40th Eur. Solid-State Circuits Conf. (ESSCIRC)*, 2014, pp. 83–86.
- [17] C. Vaucher, I. Ferencic, M. Locher, S. Sedvallson, U. Voegeli, and Z. Wang, "A family of low-power truly modular programmable dividers in standard 0.35-µm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 35, no. 7, pp. 1039–1045, Jul. 2000.
- [18] V. Szortyka, Q. Shi, K. Raczkowski, B. Parvais, M. Kuijk, and P. Wambacq, "A 42 mW 230 fs-jitter sub-sampling 60 GHz PLL in 40 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, 2014, pp. 366–367.
- [19] C.-W. Yao, L. Lin, B. Nissim, H. Arora, and T. Cho, "A low spur fractional-N digital PLL for 802.11a/b/g/n/ac with 0.19 psrms jitter," in *Proc. Symp. VLSI Circuits (VLSIC)*, 2011, pp. 110–111.
- [20] Y.-C. Yang, S.-A. Yu, Y.-H. Liu, T. Wang, and S.-S. Lu, "A quantization noise suppression technique for  $\Delta\Sigma$  fractional-N frequency synthesizers," *IEEE J. Solid-State Circuits*, vol. 41, no. 11, pp. 2500–2511, Nov. 2006.



**Kuba Raczkowski** received the M.Sc. degree in electrical engineering from Warsaw University of Technology, Warsaw, Poland, in 2006, and the Ph.D. degree from K.U. Leuven, Belgium, in 2011 for his work on 60 GHz phased-array circuits and systems. Since then he has been with imec, Leuven, Belgium, working on high-performance RF synthesizers. Currently, he is part of the team developing custom, specialty imagers.



**Nereo Markulic** (S'14) was born in 1988 in Rijeka, Croatia. He received the M.Sc. degree (*magna cum laude*) in electrical engineering from the University of Zagreb, Croatia, in 2012. In 2014 he was awarded a scholarship from the Flemish Government (IWT-Vlaanderen) to pursue a Ph.D. degree at the Vrije Universiteit Brussel, Belgium, in collaboration with Interuniversity Micro-Electronic Centre (IMEC), Belgium. His research is focused on digitally assisted frequency synthesizers for multi-standard radios in CMOS.



**Benjamin Hershberg** (S'06–M'12) received H.B.S. degrees in electrical engineering and computer engineering from Oregon State University, Corvallis, OR, USA, in 2006. He received the Ph.D. degree in electrical engineering from Oregon State University in 2012 for his work in scalable, low-power switched-capacitor amplification solutions, including the techniques of ring amplification and split-CLS.

Currently, he is with imec, Leuven, Belgium, researching reconfigurable analog front-ends for software-defined radio applications.



**Jan Craninckx** (S'92–M'98–SM'07–F'14) received the M.S. and Ph.D. degrees in microelectronics (*summa cum laude*) from the ESAT-MICAS Laboratories of the Katholieke Universiteit Leuven, Belgium, in 1992 and 1997, respectively. His Ph.D. work was on the design of low-phase noise CMOS integrated VCOs and PLLs for frequency synthesis.

From 1997 to 2002, he worked with Alcatel Microelectronics (later part of STMicroelectronics) as a senior RF engineer on the integration of RF transceivers for GSM, DECT, Bluetooth, and WLAN. In 2002, he joined IMEC, Leuven, Belgium, where he currently is the Senior Principal Scientist responsible for RF, analog and mixed-signal circuit design. His research focuses on the design of RF transceiver front-ends in nanoscale CMOS for software-defined radio (SDR) systems, covering all aspects of RF, analog and data converter design.

Dr. Craninckx has authored and co-authored more than 150 papers, book chapters, and patents. He is/was a member of the Technical Program Committees for several conferences (including ESSCIRC and ISSCC), was the chair of the SSCS Benelux chapter (2006–2011), a SSCS Distinguished Lecturer (2012–2013), and is an Associate Editor of the IEEE JOURNAL OF SOLID-STATE CIRCUITS.