# A 6MHz-130MHz DLL with a Fixed Latency of One Clock Cycle Delay

Hsiang-Hui Chang, Jyh-Woei Lin, and Shen-Iuan Liu

Department of Electrical Engineering & Graduate Institute of Electronic Engineering, National Taiwan University, Taipei, Taiwan 10617, R.O.C.

## Abstract

In this paper, a wide-range delay-locked loop (DLL) with a fixed latency of one clock cycle is proposed. A phase selection circuit and a start-controlled circuit enlarge the operating frequency range of the DLL and eliminate harmonic locking problems. Theoretically, the operating frequency range of the DLL can be from $1/T_{Dmin}$ to $1/(N \times T_{Dmax})$, where $T_{Dmin}$ and $T_{Dmax}$ are the minimum and maximum delays of a delay cell, respectively, and $N$ is the number of delay cells used in the delay line. Fabricated in a 0.35-µm 1P3M standard CMOS process, the DLL occupies an active area of 880 µm × 515 µm and consumes a maximum power of 132 mW at 130 MHz. The measurement results show that the operating frequency range is from 6 MHz to 130 MHz and that the latency is just one clock cycle. Over the entire operating frequency range, the rms jitter does not exceed 25 ps.

## I. Introduction

The rapid and continuous advances in CMOS processes over the past twenty years have led to high levels of integration and fast operating speeds in electronic systems. As system complexity and operating frequency increase, synchronization becomes a paramount concern since it strongly affects system performance. Phase-locked loops (PLLs) [1] and delay-locked loops (DLLs) [2-3] have typically been employed for synchronization. Owing to the difference in their configurations, DLLs are preferred for their unconditional stability and faster locking time compared with PLLs. Besides, a DLL offers better jitter performance than a PLL because noise in the voltage-controlled delay line (VCDL) does not accumulate over many clock cycles.

Conventional DLLs may suffer from harmonic locking over a wide operating range. For a DLL to operate at lower frequencies without harmonic locking, the number of delay stages must be increased so that the maximum delay of the delay line equals the period of the lowest frequency. However, the maximum operating frequency of the DLL is then limited by the minimum delay of the delay line.

If the detected delay differs from an integer number of clock periods, the closed loop automatically corrects it by changing the delay of the VCDL. However, a conventional DLL will fail to lock, or will falsely lock to two or more periods of the input signal, if the initial delay of the VCDL is shorter than $0.5\,T_{CLK}$ or longer than $1.5\,T_{CLK}$, where $T_{CLK}$ is the period of the input signal, as shown in Fig. 1.

Fig.1 The DLL in normal lock and false lock conditions

Therefore, if the DLL is to lock to a delay of exactly one clock cycle of the input reference signal, the initial delay of the VCDL must lie between $0.5\,T_{CLK}$ and $1.5\,T_{CLK}$, regardless of the initial voltage of the loop filter. If the maximum and minimum delays of the VCDL are $T_{VCDL\_max}$ and $T_{VCDL\_min}$, respectively, the period of the input signal should satisfy the following inequality [4]:

$$\max\left(T_{VCDL\_min},\ \tfrac{2}{3} \times T_{VCDL\_max}\right) < T_{CLK} < \min\left(T_{VCDL\_max},\ 2 \times T_{VCDL\_min}\right) \tag{1}$$

However, if $T_{VCDL\_max} \ge 3 \times T_{VCDL\_min}$, no value of $T_{CLK}$ can satisfy eq. (1), and the DLL is prone to false locking. It is difficult to design a VCDL in which $T_{VCDL\_max}$ is exactly equal to $2 \times T_{VCDL\_min}$ when process variations are taken into account. Thus, several solutions [4]-[8] have been proposed to overcome this problem.
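The window defined by eq. (1) can be checked numerically. The following is a minimal sketch, not part of the original design flow; the function name `clk_period_window` and the example delay values are hypothetical and chosen only for illustration.

```python
def clk_period_window(t_vcdl_min, t_vcdl_max):
    """Return the (lower, upper) bounds on T_CLK from eq. (1), or None if empty.

    Both arguments are VCDL delays in the same (arbitrary) time unit.
    """
    if not (0 < t_vcdl_min <= t_vcdl_max):
        raise ValueError("need 0 < t_vcdl_min <= t_vcdl_max")
    lower = max(t_vcdl_min, 2.0 * t_vcdl_max / 3.0)
    upper = min(t_vcdl_max, 2.0 * t_vcdl_min)
    # Eq. (1) uses strict inequalities, so an empty or zero-width window fails.
    if lower >= upper:
        return None
    return lower, upper


# Hypothetical delays in nanoseconds (assumed values, not measured data).
print(clk_period_window(4.0, 8.0))    # T_max = 2 * T_min: a window exists
print(clk_period_window(4.0, 12.0))   # T_max = 3 * T_min: no valid T_CLK
```

With a delay ratio of 3 or more, the function returns `None`, which matches the statement that no $T_{CLK}$ satisfies eq. (1) in that case.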

In this work, a phase selection circuit is used to decide automatically how many delay cells should be used, which enables the DLL to operate over a wide frequency range. Meanwhile, a new start-controlled circuit is presented to solve the false locking problem and keep the latency at one clock cycle. The duty cycle of the input clock need not be exactly 50%.

## II. The architecture of the proposed DLL

The architecture of the proposed DLL is shown in Fig. 2. It is composed of a conventional analog DLL, a phase selection circuit, and a start-controlled circuit. Before the DLL begins to lock, the phase selection circuit chooses the output of an appropriate delay cell as the feedback signal (vcdl\_clk) according to the frequency of the input signal. In other words, the number of delay cells in use may change with the input frequency. The minimum delay of the delay line, $T_{Dmin}$, is determined by one unit-delay cell. The maximum delay is $N \times T_{Dmax}$, where $N$ is the number of unit-delay cells. Thus, the operating frequency range of the DLL can be from $1/T_{Dmin}$ to $1/(N \times T_{Dmax})$.

Fig.2 System architecture of the proposed DLL

As the input frequency increases, the phase selection circuit selects a smaller number of delay cells, and the gain of the VCDL, which is proportional to the number of delay cells, becomes smaller. To keep an adequate loop bandwidth for the DLL, the capacitance used in the loop filter must also become smaller. In this work, 3-bit control signals generated by the phase selection circuit switch the number of capacitors in the loop filter depending on the selected phase.

After vcdl\_clk is decided, the DLL starts the locking process, which is controlled by the start-controlled circuit. First, the delay between the input and output of the VCDL is set to its minimum value, and then the down signal of the PFD is activated, assuming that the VCDL delay increases as the control voltage decreases. The delay between the input and output of the VCDL therefore increases until it reaches one clock period of the input signal. Thus, the DLL does not fall into false locking, and the latency is fixed at one clock cycle no matter how long a delay the VCDL can provide.
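The theoretical frequency limits stated in this section, $1/(N \times T_{Dmax})$ at the low end and $1/T_{Dmin}$ at the high end, can be evaluated with a short sketch. The function name `dll_frequency_range` and the numeric values below are assumptions for illustration only; they are not the cell delays of the fabricated chip.

```python
def dll_frequency_range(n_cells, t_d_min, t_d_max):
    """Return (f_low, f_high) in Hz for unit-cell delays given in seconds.

    f_high is set by one cell at its minimum delay; f_low is set by all
    n_cells cells at their maximum delay.
    """
    if n_cells < 1 or not (0 < t_d_min <= t_d_max):
        raise ValueError("need n_cells >= 1 and 0 < t_d_min <= t_d_max")
    f_high = 1.0 / t_d_min
    f_low = 1.0 / (n_cells * t_d_max)
    return f_low, f_high


# Hypothetical values, not taken from the fabricated chip.
f_low, f_high = dll_frequency_range(n_cells=10, t_d_min=5e-9, t_d_max=20e-9)
print(f"{f_low / 1e6:.1f} MHz to {f_high / 1e6:.1f} MHz")
```

The sketch only captures the ideal bounds; the measured range of a real chip also depends on the phase selection granularity and on circuit non-idealities.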

## III. Circuit description

### A. Phase selection circuit

The phase selection circuit consists of two blocks: an edge detector and a multiplexer with a decoder, as shown in Fig. 3. The schematic and timing diagram of the edge detector are shown in Fig. 4 and Fig. 5, respectively. In the initial state, the signal *startb* is set low to reset the edge detector outputs (i.e., $d3 \sim d10$), and the delay of the VCDL is set to its minimum value. When *startb* goes high, the edge detector detects the rising edges of its input signals in sequence between the next two rising edges of ref\_clk. Referring to Fig. 5(a), suppose that the signals (phase 3 ~ phase 10) all have rising edges in sequence during one clock cycle. The outputs ($d3 \sim d10$) are then all high, and the multiplexer selects phase 10 as the output signal, vcdl\_clk. If the input frequency is higher, suppose that the timing diagram is similar to Fig. 5(b). All the inputs have rising edges during one clock cycle, but only the rising edges of phases 1 to 4 occur in sequence, so phase 4 is selected. The signal vcdl\_clk stays low until the selected phase is chosen. After vcdl\_clk is decided, the DLL starts the locking process, which is explained later. The decoder converts the signals ($d3 \sim d10$) into 3-bit control signals, which switch the number of capacitors used in the loop filter to tune the loop bandwidth.
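The selection rule described above can be sketched behaviorally. This is a simplified model under stated assumptions, not the gate-level edge detector of Fig. 4: it assumes phase $k$ rises $k$ unit delays after ref\_clk (with the VCDL at its minimum delay) and flags a phase when that edge falls inside one clock period. The function name `select_phase`, the fallback of returning `None`, and the numeric values are hypothetical.

```python
def select_phase(t_clk, t_cell_min, first_phase=3, last_phase=10):
    """Behavioral model of the phase selection (assumed, simplified).

    Returns (selected_phase, flags), where flags[k] models the edge detector
    output d_k: True if the rising edge of phase k arrives within one period.
    """
    flags = {}
    for k in range(first_phase, last_phase + 1):
        # Phase k is delayed by k unit cells relative to ref_clk.
        flags[k] = (k * t_cell_min) < t_clk
    # The multiplexer picks the last phase whose edge was seen in sequence.
    selected = max((k for k, hit in flags.items() if hit), default=None)
    return selected, flags


# Hypothetical unit delay of 5 time units (assumed, not a measured value).
print(select_phase(t_clk=100.0, t_cell_min=5.0)[0])  # low input frequency
print(select_phase(t_clk=22.0, t_cell_min=5.0)[0])   # higher input frequency
```

In the second call only phases 3 and 4 arrive within the period, so phase 4 is chosen, which mirrors the situation described for Fig. 5(b).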



Fig.3 Block diagram of the phase selection circuit



Fig.4 Schematic of the edge detection circuit



Fig.5 Timing diagram of the edge detection circuit



Fig. 6 Schematic of start-controlled circuit associated with a PFD



Fig.7 Timing diagram of the start-controlled circuit

### B. Start-controlled circuit

The schematic of the start-controlled circuit and the associated PFD is shown in Fig. 6. It consists of only two rising-edge-triggered DFFs, two NAND gates, and two inverters. The timing diagram of the start-controlled circuit is shown in Fig. 7. Initially, startb is set low to clear the outputs of the two DFFs. Therefore, setupb is low and pulls the control voltage to VDD, as shown in Fig. 3 (i.e., it sets the VCDL delay to its minimum value). In this state, both inputs of the PFD are low. When startb goes high, setupb also goes high. After two consecutive falling edges of vcdl\_clk trigger the DFFs, the down signal of the PFD is activated and the delay of the VCDL increases. Owing to the negative feedback of the loop, the delay of the VCDL increases until it equals one clock period of the input signal. To equalize the delays of path1 and path2, dummy loads should be added at point A. Compared with [5], the start-controlled circuit has two advantages: the proposed circuit is simple, and the duty cycles of ref\_clk and vcdl\_clk need not be exactly 50%.
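The effect of starting from the minimum delay can be illustrated with a first-order behavioral model. This is a sketch under simplifying assumptions, not a circuit simulation: the loop is reduced to a per-cycle update proportional to the phase error, and the function name `lock_from_minimum`, the loop gain, and the delay values are all hypothetical.

```python
def lock_from_minimum(t_clk, d_min, d_max, gain=0.2, tol=1e-3, max_cycles=1000):
    """Simplified lock model: the VCDL delay starts at d_min and only grows.

    Returns (final_delay, history). The delay update per cycle is an assumed
    first-order approximation of the PFD, charge pump, and loop filter.
    """
    if not (0 < d_min < t_clk <= d_max):
        raise ValueError("need 0 < d_min < t_clk <= d_max")
    delay = d_min              # setupb has pulled the control voltage to VDD
    history = [delay]
    for _ in range(max_cycles):
        error = t_clk - delay  # positive: vcdl_clk leads, so "down" is active
        if abs(error) < tol:
            break
        delay = min(max(delay + gain * error, d_min), d_max)
        history.append(delay)
    return delay, history


# Hypothetical values: the VCDL could reach 3 * T_CLK, yet the loop
# settles at one clock period because it approaches it from below.
final_delay, history = lock_from_minimum(t_clk=10.0, d_min=4.0, d_max=30.0)
print(round(final_delay, 2), len(history))
```

Because the delay rises monotonically from its minimum, the first lock point it meets is one clock period, even when the maximum delay exceeds two periods.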

### C. Other circuits

In this work, a dynamic-logic-style PFD [9] is adopted to avoid the dead zone problem and improve the operating speed. To mitigate charge injection errors induced by the parasitic capacitors of the switches and current source transistors, the charge pump circuit developed in [10] is used. The delay cell circuit is similar to that in [10]. The control voltage of the loop filter is fed directly to the NMOS rather than the PMOS. Therefore, the transfer curve of delay versus control voltage is monotonically decreasing.

## IV. Measurement results

The prototype chip was fabricated in a 0.35-µm single-poly triple-metal standard CMOS process, and the microphotograph of the chip is shown in Fig. 8. The capacitors used in the loop filter are integrated on chip and implemented as metal-to-metal capacitors. The experimental results show that the DLL can operate over the frequency range of 6 MHz to 130 MHz. Fig. 9 and Fig. 10 show the first four cycles of the locking process at operating frequencies of 6 MHz and 130 MHz, respectively. After the signal startb goes high, the phase selection circuit selects the VCDL output whose rising edge is as close as possible to the next rising edge of the input clock, ref\_clk. Fig. 9 and Fig. 10 also show that, after startb goes high, the first rising edge of the VCDL output clock, vcdl\_clk, leads that of the input clock, ref\_clk. Since the control voltage $V_{ctrl}$ in Fig. 2 is initially set to Vdd by startb, the phase detector and the charge pump circuit discharge the loop filter to increase the delay of the VCDL, which aligns the phases of the input clock and the output clock of the VCDL. Fig. 11 shows the jitter histogram when the DLL operates at 130 MHz. Fig. 12 shows the measured rms jitter at different frequencies. Table I gives the performance summary. These results confirm that the proposed DLL has a wide operating range and a fixed latency of one clock cycle.
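To put the rms jitter values reported in Table I in perspective, they can be expressed as a fraction of the clock period. The following sketch uses only the two (jitter, frequency) pairs from Table I; the function name `jitter_fraction_of_period` is hypothetical, and this normalization is an illustrative addition rather than a figure of merit used in the measurements.

```python
def jitter_fraction_of_period(jitter_ps, freq_mhz):
    """Return the jitter divided by the clock period (dimensionless)."""
    period_ps = 1e6 / freq_mhz  # 1 MHz corresponds to a 1e6 ps period
    return jitter_ps / period_ps


# rms jitter values from Table I.
for jitter_ps, freq_mhz in [(24.77, 6.0), (3.297, 130.0)]:
    frac = jitter_fraction_of_period(jitter_ps, freq_mhz)
    print(f"{freq_mhz:.0f} MHz: {100 * frac:.4f} % of the period")
```

The rms jitter corresponds to roughly 0.015% of the period at 6 MHz and roughly 0.043% at 130 MHz.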



Fig.8 Microphotograph of the chip.



Fig.9 The DLL at initial state when operating frequency is 6MHz



Fig.10 The DLL at initial state when operating frequency is 130MHz



Fig.11 Jitter histogram when DLL operates at 130MHz



Fig.12 Measurement results of rms jitter over different frequencies

## V. Conclusions

A DLL with wide-range operation and a fixed latency of one clock cycle is proposed in this paper. First, the multiphase outputs of the VCDL are all sent to the phase selection circuit. The phase selection circuit then automatically selects one of the delayed outputs as the feedback signal. As a result, this DLL can operate over a wide range without suffering from harmonic locking problems. Ideally, this DLL can operate from $1/(N \times T_{Dmax})$ to $1/T_{Dmin}$. The experimental results demonstrate the functionality of the proposed DLL. Moreover, at all measured operating frequencies, the jitter performance is within an acceptable range and the latency is just one clock cycle.

Table I: Performance Summary

| Parameter                 | Value                                  |
|---------------------------|----------------------------------------|
| Process                   | 0.35-µm 1P3M TSMC CMOS process         |
| Operating Voltage         | 3.3 V                                  |
| Operating Frequency Range | 6 MHz ~ 130 MHz                        |
| RMS Jitter                | 24.77 ps @ 6 MHz<br>3.297 ps @ 130 MHz |
| Peak-to-Peak Jitter       | 210 ps @ 6 MHz<br>24.3 ps @ 130 MHz    |
| Power Dissipation         | 132 mW @ 130 MHz                       |
| Active Area               | 880 µm × 515 µm (without pads)         |

## VI. References

- [1] B. Razavi, "Monolithic phase-locked loops and clock recovery circuits: theory and design," IEEE Press, 1996.
- [2] R. L. Aguiar and D. M. Santos, "Multiple target clock distribution with arbitrary delay interconnects," *IEE Electronics Letters*, vol. 34, no. 22, pp. 2119-2120, Oct. 1998.
- [3] R. B. Watson, Jr. and R. B. Iknaian, "Clock buffer chip with multiple target automatic skew compensation," *IEEE J. Solid-State Circuits*, vol. 30, no. 11, pp. 1267-1276, Nov. 1995.
- [4] Y. Moon, J. Choi, K. Lee, D. K. Jeong, and M. K. Kim, "An all-analog multiphase delay-locked loop using a replica delay line for wide-range operation and low-jitter performance," *IEEE J. Solid-State Circuits*, vol. 35, no. 3, pp. 377-384, Mar. 2000.
- [5] C. H. Kim et al., "A 64-Mbit, 640-Mbyte/s bidirectional data strobed, double-data-rate SDRAM with a 40-mW DLL for a 256-Mbyte memory system," *IEEE J. Solid-State Circuits*, vol. 33, no. 11, pp. 1703-1710, Nov. 1998.
- [6] D. J. Foley and M. P. Flynn, "CMOS DLL-based 2-V 3.2-ps jitter 1-GHz clock synthesizer and temperature-compensated tunable oscillator," *IEEE J. Solid-State Circuits*, vol. 36, no. 3, pp. 417-423, Mar. 2001.
- [7] H. Yahata, T. Okuda, H. Miyashita, H. Chigasaki, B. Taruishi, T. Akiba, Y. Kawase, T. Tachibana, S. Ueda, S. Aoyama, A. Tsukinori, K. Shibata, M. Horiguchi, Y. Saiki, and Y. Nakagome, "A 256-Mb double-data-rate SDRAM with a 10-mW analog DLL circuit," *Symposium on VLSI Circuits Digest of Technical Papers*, pp. 74-75, June 2000.
- [8] Y. Okuda, M. Horiguchi, and Y. Nakagome, "A 66-400 MHz, adaptive-lock-mode DLL circuit with duty-cycle error correction," *Symposium on VLSI Circuits Digest of Technical Papers*, pp. 37-38, June 2001.
- [9] S. Kim et al., "A 960-Mb/s/pin interface for skew-tolerant bus using low jitter PLL," *IEEE J. Solid-State Circuits*, vol. 32, no. 5, pp. 691-700, May 1997.
- [10] J. G. Maneatis, "Low-jitter process-independent DLL and PLL based on self-biased techniques," *IEEE J. Solid-State Circuits*, vol. 31, no. 11, pp. 1723-1732, Nov. 1996.