# A 500-MHz–1.25-GHz Fast-Locking Pulsewidth Control Loop With Presettable Duty Cycle

Sung-Rung Han, Student Member, IEEE, and Shen-Iuan Liu, Senior Member, IEEE

Abstract—A 500-MHz–1.25-GHz fast-locking pulsewidth control loop (PWCL) with presettable duty cycle is realized in 0.35-$\mu$m CMOS technology. The proposed voltage-difference-to-digital converter and switched charge pump circuits reduce the lock time of a conventional PWCL. Compared with the conventional PWCL, the proposed circuit can reduce the lock time by a factor of 2.58. A method to preset the duty cycle of the output clock is also described. Circuit measurements verify that the duty cycle of the output clock can be adjusted from 35% to 70% in steps of 5%.

Index Terms—Duty-cycle presetting, fast locking, pulsewidth control loop (PWCL), switched charge pump, voltage-difference-to-digital converter.

## I. INTRODUCTION

To meet the demand for high-speed operation, many systems adopt double-data-rate (DDR) techniques, such as DDR SDRAM and double-sampling ADCs. In these systems, both the rising and falling edges of the clock are used to sample the input data, which requires that the duty cycle of the clock be precisely maintained at 50%. Generating a clock with a precise 50% duty cycle at high speed is therefore an important issue. Several approaches [1]-[5] can provide 50% duty-cycle clocks. However, open-loop methods [1], [2] that rely on complementary clocks provide no immunity to device mismatches. A delay-locked loop (DLL)-based method [3] is speed-limited by its phase detector. The methods in [4] and [5] suffer from duty-cycle distortion in high-speed operation due to clock mismatches and dc level mismatches, respectively.

The pulsewidth control loop (PWCL) [6], whose architecture is suitable for high-speed applications, was proposed to adjust the output duty cycle of a multistage driver. In a PWCL, because of the high gain of the multistage driver, the control voltage must be quiet enough to ensure a precise duty cycle once the loop is locked. Since both the rising and falling edges of the input clock contribute to the jitter, the duty-cycle variation can be twice the peak-to-peak jitter. For these reasons, the loop gain must be kept low. With low gain, however, the loop may take a long time to settle, and this long settling time reduces the timing budget for other function blocks in a system. Although the phase-fixed PWCL proposed in [7] was intended to be used with a DLL/PLL, the same problem will occur when a precise duty cycle is required.

Manuscript received June 19, 2003; revised November 12, 2003.

The authors are with the Department of Electrical Engineering and Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan 10617, R.O.C. (e-mail: lsi@cc.ee.ntu.edu.tw).

Digital Object Identifier 10.1109/JSSC.2003.822781

This paper proposes a new approach to achieve a fast-locking and duty-cycle-presettable PWCL. The operation of the circuit is investigated in Section II, where new circuit models and design equations are developed. The design approach for a fast-locking PWCL is described in Section III. Section IV discusses the design issues of duty-cycle presetting. The experimental results are presented in Section V. Finally, conclusions are given in Section VI.

## II. TRANSIENT ANALYSIS OF THE CONVENTIONAL PWCL

The conventional PWCL [6] is shown in Fig. 1(a). To design a fast-locking PWCL, the transient mechanism must first be investigated. To simplify the analysis, $V_c$ is initialized to $V_{dd}$ and $V_r$ is initialized to its equilibrium voltage, $V_{ref}$. Under this condition, the transient response of the control voltage $V_{\rm ctl}$ is sketched in Fig. 1(b). As shown in the figure, there are four regions in the transient. The behavior of each region is described as follows.

### A. Nonlinear Region

In this region, because the voltage difference between $V_c$ and $V_r$ is very large, the amplifier is saturated and the control voltage $V_{\rm ctl}$ sits at the lower bound voltage $V_{\rm lb}$ of the differential pair. Under these conditions, clock<sub>out</sub> is always high. The circuit operation in this region can be modeled as shown in Fig. 2(a). The parameters $I_{cp2}$ and $C$ model the second charge pump and the loop filter, respectively. Throughout this region, the control voltage remains at $V_{\rm lb}$.

### B. Linear Region

As  $V_c$  is discharged, the input voltage difference becomes small enough to allow the amplifier to operate in the linear region, and the transient enters the linear region. In this region, although the control voltage  $V_{\rm ctl}$  increases gradually, clock<sub>out</sub> remains at a high level. A circuit model for this region is shown in Fig. 2(b) where the factor  $A_0$  is the dc gain of the amplifier. If the output common-mode voltage of the amplifier is defined as  $V_{\rm cm}$ , then  $V_c$  at the end of this region can be expressed as

$$V_{c,\text{final}} = V_{\text{ref}} + \frac{2 \cdot (V_{\text{cm}} - V_{\text{lin}})}{A_0} \tag{1}$$

where  $V_{\text{lin}}$  is the control voltage at which the transient starts to enter the acquisition region. The transient time required for both nonlinear and linear regions can be expressed as

$$t_{n-l} = \frac{C \cdot [V_{dd} - V_{c,\text{final}}]}{I_{\text{cp2}}}. \tag{2}$$



Fig. 1. (a) Conventional PWCL. (b) Transient response of the control voltage.

### C. Acquisition Region

Only when the control voltage has increased to $V_{\text{lin}}$ can the output duty cycle start to decrease with time. At this point the circuit begins to operate in a negative feedback mode and enters the acquisition region. A small-signal model for this region is shown in Fig. 2(c), where $D_{in}$ and $D_{out}$ represent the input and output duty cycles, respectively. The corner frequency $\omega_a$ is determined by the output resistance of the amplifier and the loading capacitance $C_A$. To reduce the complexity of the model, the factor $K_c$ represents the total gain of the control stage and buffer line. If the node $V_r$ is taken as the reference point, the gain between the duty-cycle error, $D_{in} - D_{out}$, and the effective charge pump current is $-2 \cdot I_p$. In this region, the initial state is the same as the locked state for an input clock with 100% duty cycle. The operation of the circuit can therefore be treated as if the output duty cycle were initially locked to 100% and then changed abruptly to 50%. On this basis, the transient time in this region can be derived as follows. The duty-cycle transfer function can be expressed as

$$\begin{aligned} H(s) = \frac{D_{\text{out}}(s)}{D_{\text{in}}(s)} &= \frac{2 \cdot I_p \cdot A_0 \cdot \omega_a \cdot (K_c/C)}{s^2 + \omega_a \cdot s + 2 \cdot I_p \cdot A_0 \cdot \omega_a \cdot (K_c/C)} \\ &= \frac{\omega_n^2}{s^2 + 2 \cdot \xi \cdot \omega_n \cdot s + \omega_n^2} \end{aligned} \tag{3}$$

where  $\xi$  is the damping factor and  $\omega_n$  is the natural frequency. Assuming the damping factor is designed to be close to but less than unity, the step response for the duty cycle [8] can be solved from the above equation, giving

$$\Delta D_{\text{out}}(t) = \left\{ 1 - \frac{1}{\sqrt{1 - \xi^2}} \cdot e^{-\xi \cdot \omega_n t} \cdot \sin\left[\omega_n \cdot \sqrt{1 - \xi^2} \cdot t + \sin^{-1}\left(\sqrt{1 - \xi^2}\right)\right] \right\} \cdot \Delta D_{\text{in}} \cdot u(t) \tag{4}$$

where  $\Delta D_{in}$  and  $\Delta D_{out}$  are the step changes of the input and output duty cycles, respectively. Substituting loop parameters into (4), the transient time  $t_a$  in this region can be found. Then, the overall lock time of the system can be expressed as

$$t_{\text{lock}} = t_{n-l} + t_a. \tag{5}$$

For $C = 20$ pF, $I_p = 50\ \mu\text{A}$, $A_0 = 5$, $K_c = 1.6\ \text{V}^{-1}$, and $\xi = 0.96$, the equations derived above give a lock time of 761.6 ns when the duty cycle of the output clock settles to $50\% \pm 0.1\%$. Note that the transient time in the nonlinear and linear regions is 660 ns, whereas the transient time in the acquisition region is only 101.6 ns. The transient time through the first two regions therefore dominates the total lock time.
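The quoted figures can be cross-checked numerically. The following is a minimal sketch under stated assumptions, not the authors' calculation: it assumes $I_{cp2} = I_p$, takes $V_{dd} - V_{c,\text{final}} = 1.65$ V (the sum of the three voltage segments quoted in Section III), obtains $\omega_n = 4 \xi I_p A_0 K_c / C$ by matching coefficients in (3), and judges settling on the decaying envelope of (4) rather than on the exact waveform. All function names are illustrative.

```python
import math

# Values quoted in the text (Section II).
C = 20e-12      # loop-filter capacitance, F
I_P = 50e-6     # charge pump current, A
A0 = 5.0        # dc gain of the amplifier
K_C = 1.6       # gain of the control stage and buffer line, 1/V
XI = 0.96       # damping factor

# Assumption: V_dd - V_c,final = 1.3 V + 0.25 V + 0.1 V (segments of Section III).
DELTA_V = 1.65  # V


def t_nonlinear_linear(c, delta_v, i_cp2):
    """Eq. (2): constant-current discharge from V_dd down to V_c,final."""
    return c * delta_v / i_cp2


def natural_frequency(c, i_p, a0, k_c, xi):
    """Matching coefficients in (3): w_n^2 = 2*I_p*A0*w_a*K_c/C and
    2*xi*w_n = w_a, hence w_n = 4*xi*I_p*A0*K_c/C."""
    return 4.0 * xi * i_p * a0 * k_c / c


def t_acquisition(w_n, xi, step, tol):
    """Time at which the envelope of (4), scaled by the duty-cycle step,
    falls to tol. Assumption: the envelope is used as the settling bound."""
    ratio = tol * math.sqrt(1.0 - xi ** 2) / step
    return -math.log(ratio) / (xi * w_n)


# Assumption: I_cp2 equals I_P.
t_nl = t_nonlinear_linear(C, DELTA_V, I_P)
w_n = natural_frequency(C, I_P, A0, K_C, XI)
# Duty cycle steps from 100% to 50% and must settle within 0.1%.
t_a = t_acquisition(w_n, XI, step=0.50, tol=0.001)
t_lock = t_nl + t_a  # Eq. (5)

print(f"t_n-l  = {t_nl * 1e9:.1f} ns")
print(f"t_a    = {t_a * 1e9:.1f} ns")
print(f"t_lock = {t_lock * 1e9:.1f} ns")
```

Under these assumptions the three printed values agree with the 660 ns, 101.6 ns, and 761.6 ns quoted above, which suggests (but does not prove) that an envelope-based settling criterion was used.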

## III. FAST-LOCKING CIRCUIT DESIGN

To achieve fast locking, the design effort is focused in two directions: reducing the transient times in the nonlinear and linear regions, and speeding up the transient in the acquisition region. Both objectives can be achieved with the proposed architecture shown in Fig. 3. The fast-locking mechanism, enclosed in the dashed line, consists of a voltage-difference-to-digital converter (VDDC) and a switched charge pump (SCP) circuit. The VDDC detects which of the transient regions discussed in Section II the loop is in. The SCP circuits provide different charge pump currents according to the control codes from the VDDC and the external codes that are used to preset the duty cycle of clock<sub>out</sub>.

Fig. 2. Circuit models in different regions. (a) Nonlinear region. (b) Linear region. (c) Acquisition region.

Fig. 3. Proposed fast-locking PWCL.



Fig. 4. Voltage-difference-to-digital converter.

### A. VDDC

The VDDC is depicted in Fig. 4. It is similar to a flash ADC [9], except that the reference voltage $V_r$ is connected to the middle of the resistor string. Relative to $V_r$, an additional eight reference voltages, $V_{r1}$–$V_{r8}$, are generated in the resistor string. By comparing these reference voltages with $V_c$, the comparators produce digital output codes that represent the voltage difference between $V_c$ and $V_r$. The correspondence between the voltage difference and the output code is as follows. If $V_c$ is greater than $V_r$, $d_5$–$d_8$ are always high and the code of $d_1$–$d_4$ is proportional to the voltage difference. On the other hand, if $V_r$ is greater than $V_c$, then $d_1$–$d_4$ are always low and the code of $d_5$–$d_8$ is inversely proportional to the voltage difference. In the locked state, $d_1$–$d_4$ are low and $d_5$–$d_8$ are high. The regions of the transient can therefore be recognized from these codes. A lock window set by $V_{r4}$ and $V_{r5}$ detects when the loop approaches the locked state. Once the voltage difference becomes small enough for $V_c$ to enter the lock window, the fast-locking mechanism is disabled.
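The code behavior described above can be illustrated with a small behavioral model. This is only a sketch under assumptions: the tap offsets are hypothetical values placed symmetrically around $V_r$ (so that $V_{r1} > \dots > V_{r4} > V_r > V_{r5} > \dots > V_{r8}$), comparator $i$ is assumed to output high when $V_c > V_{ri}$, and the names `vddc_code` and `in_lock_window` are illustrative rather than taken from the paper.

```python
# Hypothetical tap offsets from V_r in volts (not the values used on the chip).
DEFAULT_OFFSETS = (0.40, 0.30, 0.20, 0.10)


def vddc_code(v_c, v_r, offsets=DEFAULT_OFFSETS):
    """Return (d1, ..., d8) as 0/1 values for the given V_c and V_r.

    Assumption: comparator i outputs 1 when V_c exceeds reference V_ri.
    """
    upper = [v_r + off for off in offsets]            # V_r1 .. V_r4
    lower = [v_r - off for off in reversed(offsets)]  # V_r5 .. V_r8
    return tuple(int(v_c > ref) for ref in upper + lower)


def in_lock_window(code):
    """Locked state: d1-d4 low and d5-d8 high (V_c between V_r5 and V_r4)."""
    return code == (0, 0, 0, 0, 1, 1, 1, 1)


V_R = 1.65  # hypothetical equilibrium voltage, V
for v_c in (3.3, 1.9, 1.65, 1.4, 0.0):
    code = vddc_code(v_c, V_R)
    print(f"V_c = {v_c:4.2f} V -> {code}, lock window: {in_lock_window(code)}")
```

With this ordering, the number of high bits in $d_1$–$d_4$ grows as $V_c$ rises above $V_r$, and the number of high bits in $d_5$–$d_8$ shrinks as $V_c$ falls below $V_r$, matching the description in the text.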

### B. Switched Charge Pump

The purpose of the SCP circuit is to provide an output current that is proportional to the voltage difference between $V_c$ and $V_r$. The SCP circuit shown in Fig. 5 is based on the Mu–Svensson charge pump [6] and the multiple-path charge pump [10]. If the output bits of the VDDC, $d_1$–$d_8$, are all low, a large constant current is provided. As the code of $d_5$–$d_8$ increases, the output current decreases to correspond to the smaller voltage difference. When $d_1$–$d_4$ are all low and $d_5$–$d_8$ are all high, the SCP outputs the minimum current $I_{\rm CP}$, which is the same as the charge pump current in the conventional circuit. As the code of $d_1$–$d_4$ increases, the output current increases again to correspond to the larger voltage difference. The magnitude of the current in each current source path is shown in the figure. All 18 currents are expressed as ratios of $I_{\rm CP}$.
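The dependence of the SCP current on the VDDC code can be sketched behaviorally as follows. The extra-current weights below are hypothetical placeholders, since the actual ratios appear only in Fig. 5, and the model ignores the separate charge and discharge paths; it only illustrates a current that is minimal in the lock window and grows with the code distance from it. The name `scp_current` is illustrative.

```python
# Hypothetical extra currents, as multiples of I_CP, enabled for each
# reference tap crossed when moving outward from the lock window.
EXTRA_WEIGHTS = (1.0, 1.0, 2.0, 4.0)


def scp_current(code, i_cp=50e-6, extra=EXTRA_WEIGHTS):
    """Return the modeled SCP output current magnitude for a code (d1..d8)."""
    d_upper, d_lower = code[:4], code[4:]
    taps_above = sum(d_upper)       # upper references that V_c exceeds
    taps_below = 4 - sum(d_lower)   # lower references that V_c is below
    taps = max(taps_above, taps_below)
    return i_cp * (1.0 + sum(extra[:taps]))


for code in [(0,) * 8, (0, 0, 0, 0, 0, 0, 1, 1),
             (0, 0, 0, 0, 1, 1, 1, 1), (0, 0, 1, 1, 1, 1, 1, 1), (1,) * 8]:
    print(code, f"{scp_current(code) * 1e6:.0f} uA")
```

In the locked code the model returns $I_{\rm CP}$ alone, and the all-low and all-high codes return the largest current, which mirrors the qualitative behavior described above.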

To verify the fast-locking mechanism, a design example is given as follows. Because the initial voltage of $V_c$ is $V_{dd}$, the reference voltages compared with $V_c$ are $V_{r1}$–$V_{r4}$. When $V_c$ is greater than $V_{r1}$, the transient is in the nonlinear region. When the transient is in the linear region, $V_c$ is between $V_{r1}$ and $V_{r2}$. Reference voltages $V_{r3}$ and $V_{r4}$ are set to speed up the transient in the acquisition region. Because the transient time during the nonlinear and linear regions dominates the lock time, the following calculation focuses on these two regions, and the effect of acceleration in the acquisition region is ignored. Assuming that the charge pump current in the acquisition region is the same as the charge pump current $I_{cp}$ in the locked state, the lock time can be derived as follows:

$$t_{n-l} = C \left\{ \frac{V_{dd} - V_{r1}}{I_{cp,n}} + \frac{V_{r1} - V_{r2}}{I_{cp,l}} + \frac{V_{r2} - V_{c,\text{final}}}{I_{cp}} \right\} \tag{6}$$

where $I_{cp,n}$ and $I_{cp,l}$ are the charge pump currents in the nonlinear and linear regions, respectively. The three voltage segments in (6) are designed to be 1.3, 0.25, and 0.1 V, respectively. The remaining parameters are the same as in Section II. The transient time through the nonlinear and linear regions is then calculated to be 196 ns. Adding the transient time in the acquisition region, 101.6 ns, gives a lock time of 297.6 ns, which is less than half of the conventional lock time of 761.6 ns. Furthermore, if the added charge pump currents are made larger, the lock time can be reduced further.
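Equation (6) is a sum of constant-current discharge times, one per voltage segment, and can be evaluated with a short sketch. The segment voltages, $C$, and $I_{cp}$ are the values quoted in the text. The current multipliers for the nonlinear and linear regions are hypothetical, because the actual ratios are given only in Fig. 5, so the second result below is illustrative and is not expected to reproduce the 196 ns figure.

```python
def t_nl_segments(c, segments, currents):
    """Eq. (6): C times the sum of (voltage segment / charge pump current)."""
    return c * sum(dv / i for dv, i in zip(segments, currents))


C = 20e-12                   # loop-filter capacitance, F
I_CP = 50e-6                 # locked-state charge pump current, A
SEGMENTS = (1.3, 0.25, 0.1)  # V_dd-V_r1, V_r1-V_r2, V_r2-V_c,final in V

# Consistency check: with I_CP in every segment, (6) reduces to (2).
t_conventional = t_nl_segments(C, SEGMENTS, (I_CP, I_CP, I_CP))

# Hypothetical boosted currents: 4x in the nonlinear region, 2x in the linear.
t_boosted = t_nl_segments(C, SEGMENTS, (4 * I_CP, 2 * I_CP, I_CP))

print(f"all segments at I_cp : {t_conventional * 1e9:.0f} ns")
print(f"hypothetical 4x / 2x : {t_boosted * 1e9:.0f} ns")
```

The first case returns the 660 ns of the conventional loop, which confirms that the three segments span the same voltage range as (2). The final 0.1-V segment alone contributes $C \cdot 0.1\ \text{V} / I_{cp} = 40$ ns regardless of how large the boosted currents are.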

## IV. DUTY-CYCLE PRESETTING DESIGN

Because the ratio of the charge current to the discharge current of charge pump 2 determines the output duty cycle [6], the duty cycle can easily be set by controlling both currents. There are two ways to set a given current ratio: increasing one of the two currents (charge or discharge) or decreasing the other. Because the former produces a larger current difference between the two charge pumps, the resulting jitter is larger. Thus, the latter approach is adopted here. In this design, another seven current sources with switches controlled by the external code, $d'_1$–$d'_7$, are added to SCP2 to generate an output clock with a duty cycle ranging from 35% to 70% in steps of 5%. The circuit is shown in Fig. 6.
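The relation between the current ratio and the duty cycle can be sketched with a simple charge-balance argument. As an assumption (the polarity is not stated here), suppose the pump drives the loop filter with a current $I_a$ while the clock is high and with an opposing current $I_b$ while it is low. In lock the net charge per period is zero, so $I_a \cdot D = I_b \cdot (1 - D)$ and $I_a / I_b = (1 - D)/D$. The names below are illustrative.

```python
def current_ratio_for_duty(duty):
    """Return I_a / I_b such that I_a * D == I_b * (1 - D).

    Assumption: I_a flows while the clock is high and I_b while it is low.
    """
    if not 0.0 < duty < 1.0:
        raise ValueError("duty cycle must lie strictly between 0 and 1")
    return (1.0 - duty) / duty


# Preset values stated in the text: 35% to 70% in steps of 5%.
PRESETS = [percent / 100 for percent in range(35, 75, 5)]

for duty in PRESETS:
    print(f"D = {duty:.0%} -> I_a / I_b = {current_ratio_for_duty(duty):.3f}")
```

The eight preset values are consistent with seven switched current sources if one setting corresponds to all switches being off, although that mapping is an inference rather than something stated in the text.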

## V. EXPERIMENTAL RESULTS

The proposed circuit has been fabricated in 0.35-$\mu$m CMOS technology. A microphotograph of the proposed circuit is shown in Fig. 7. The total area of the chip is $0.86 \times 0.32\ \text{mm}^2$ without pads. In this design, the loop filters and the loading capacitors are off-chip. The supply voltage is 3.3 V, and the input clock frequency ranges from 500 MHz to 1.25 GHz. For a fair comparison and to save chip area, the conventional and proposed architectures are implemented on the same chip, with switches selecting between the two circuits.

Fig. 5. Switched charge pump circuit.

Fig. 6. Duty-cycle presetting current paths.

Fig. 7. Chip microphotograph.

Fig. 8 shows the measured reference and output clocks when the loop is in the locked state. The measured rms jitter shown in Fig. 9 is 3.15 ps. In Fig. 10, the transient responses of the control voltage in the conventional and proposed circuits are shown together for comparison. The lock point is defined in this paper as the point at which the duty cycle of the output clock settles to $50\% \pm 0.1\%$. Deduced from the measured waveforms of the control voltage, the lock times of the conventional and proposed circuits are 800 and 310 ns, respectively. Consequently, compared with the conventional PWCL, the proposed circuit reduces the lock time by a factor of 2.58. Fig. 11 shows the preset duty-cycle output waveforms. The measured duty cycles of the output clocks range from 35% to 70% in steps of 5%. Table I gives a performance summary of this work.
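As a quick arithmetic check, the measured improvement factor can be compared with the one implied by the calculated lock times of Sections II and III. The sketch below uses only numbers quoted in the text; the variable names are illustrative.

```python
# Measured lock times (Section V) and calculated lock times (Sections II, III).
measured_conventional_ns = 800.0
measured_proposed_ns = 310.0
calculated_conventional_ns = 761.6
calculated_proposed_ns = 297.6

measured_speedup = measured_conventional_ns / measured_proposed_ns
calculated_speedup = calculated_conventional_ns / calculated_proposed_ns

print(f"measured   : {measured_speedup:.2f}x")
print(f"calculated : {calculated_speedup:.2f}x")
```

The measured ratio rounds to the reported factor of 2.58, and the calculated ratio of about 2.56 is close to it, so the design equations predict the improvement to within roughly 1%.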

## VI. CONCLUSION

This paper describes methods for designing a fast-locking and duty-cycle-presettable PWCL. First, the operating mechanism of the system is investigated to fully understand its behavior. Second, the system is analyzed to develop accurate system models and design equations that mathematically estimate the improvement offered by the proposed method. Finally, the design issues associated with presetting the duty cycle are discussed. The measurement results show that the proposed circuit reduces the lock time by a factor of 2.58 compared with the conventional circuit. Further, the lock speed of the system can be improved if the additional currents are increased.

Fig. 8. Output waveform of reference and output signals.

Fig. 9. Output jitter.

Fig. 10. Transient behavior of the conventional and proposed circuits.

Fig. 11. Output waveforms with different duty cycles.

TABLE I Performance Summary

| Parameter                | Value                          |
|--------------------------|--------------------------------|
| Technology               | 0.35-µm, 1-poly, 4-metal CMOS  |
| Supply Voltage           | 3.3 V                          |
| Frequency Range          | 500 MHz–1.25 GHz               |
| Lock Time (Conventional) | 800 ns                         |
| Lock Time (Proposed)     | 310 ns                         |
| Output Duty Cycle Range  | 35%–70% (in steps of 5%)       |
| Core Size                | 0.86 × 0.32 mm<sup>2</sup>     |
| Power Consumption        | 270 mW                         |

## REFERENCES

- [1] J. Lee and B. Kim, "A low-noise fast-lock phase-locked loop with adaptive bandwidth control," *IEEE J. Solid-State Circuits*, vol. 35, pp. 1137–1145, Aug. 2000.
- [2] K. Nakamura, M. Fukaishi, Y. Hirota, Y. Nakazawa, and M. Yotsuyanagi, "A CMOS 50% duty cycle repeater using complementary phase blending," in *Symp. VLSI Circuits Dig. Tech. Papers*, June 2000, pp. 48–49.
- [3] Y. Moon, J. Choi, K. Lee, D. K. Jeong, and M. K. Kim, "An all-analog multiphase delay-locked loop using replica delay line for wide-range operation and low-jitter performance," *IEEE J. Solid-State Circuits*, vol. 35, pp. 377–384, Mar. 2000.
- [4] T. Ogawa and K. Taniguchi, "A 50% duty-cycle correction circuit for PLL output," in *Proc. IEEE Int. Symp. Circuits and Systems*, vol. 4, 2002, pp. 21–24.
- [5] Y. J. Jung, S. W. Lee, D. Shim, W. Kim, C. H. Kim, and S. I. Cho, "A low jitter dual loop DLL using multiple VCDL's with a duty cycle corrector," in *Symp. VLSI Circuits Dig. Tech. Papers*, June 2000, pp. 50–51.
- [6] F. Mu and C. Svensson, "Pulsewidth control loop in high-speed CMOS clock buffers," *IEEE J. Solid-State Circuits*, vol. 35, pp. 134–141, Feb. 2000.
- [7] P. H. Yang and J. S. Wang, "Low-voltage pulsewidth control loops for SOC applications," *IEEE J. Solid-State Circuits*, vol. 37, pp. 1348–1351, Oct. 2002.
- [8] B. Razavi, *Design of Analog CMOS Integrated Circuits*. New York: McGraw-Hill, 2001.
- [9] D. Johns and K. Martin, *Analog Integrated Circuit Design*. New York: Wiley, 1997, pp. 507–513.
- [10] G. T. Roh, Y. H. Lee, and B. Kim, "Optimum phase-acquisition technique for charge-pump PLL," *IEEE Trans. Circuits Syst. II*, vol. 44, pp. 729–740, Sept. 1997.



**Sung-Rung Han** (S'00) was born in Taipei, Taiwan, R.O.C., in 1974. He received the B.S. degree in electronic engineering from Huafan University, Taipei, in 1998 and the M.S. degree in electrical engineering from Tatung University, Taipei, in 2000. Currently, he is working toward the Ph.D. degree at National Taiwan University, Taipei.

His research interests include phase-locked loops, delay-locked loops, and duty-cycle control circuits.



**Shen-Iuan Liu** (S'88–M'93–SM'03) was born in Keelung, Taiwan, R.O.C., in 1965. He received the B.S. and Ph.D. degrees in electrical engineering from National Taiwan University, Taipei, in 1987 and 1991, respectively.

During 1991–1993, he served as a Second Lieutenant in the Chinese Air Force. During 1991–1994, he was an Associate Professor in the Department of Electronic Engineering of National Taiwan Institute of Technology. He joined the Department of Electrical Engineering, National Taiwan University, Taipei, in 1994, and has been a Professor since 1998. His research interests are in analog and digital integrated circuits and systems.