# Physical Implementation and Testing of Low-Power Adiabatic Flip-Flops with Energy-Recycling Pads

**Abstract**. This paper presents adiabatic flip-flops based on CPAL (complementary pass-transistor adiabatic logic) circuits with energy-recycling output pad cells. The energy-recycling output pad cells for driving adiabatic chips consist mainly of bonding pads, ESD (electrostatic discharge) protection circuits, and two-stage energy-recycling buffers. The adiabatic flip-flops and sequential circuits with energy-recycling output pad cells have been fabricated in a Chartered 0.35 µm process. Compared with their conventional CMOS counterparts, the adiabatic flip-flops achieve large energy savings over a wide range of frequencies.

**Summary.** The paper proposes an adiabatic flip-flop based on CPAL (complementary pass-transistor adiabatic logic) circuits. An output block with energy recycling is also used. (**Implementation and testing of a low-power adiabatic flip-flop with an energy-recycling block**)

Keywords: adiabatic computing, adiabatic flip-flops, energy-recycling pad cells, physical implementation.

Polish keywords: adiabatic flip-flop, energy recycling.

## Introduction

High power consumption in VLSI chips shortens battery service life, so the goal of low-power design for battery-powered devices is to extend battery service life while meeting performance requirements [1]. Reducing power dissipation is also a design goal for non-portable devices, since excessive power dissipation increases packaging and cooling costs and can cause reliability problems.

Adiabatic logic is an attractive low-power approach that uses AC voltage supplies (power-clocks) to recycle the energy of circuits instead of dissipating it as heat [2, 3, 4]. Because sequential circuits are main logic units in digital systems, several adiabatic flip-flops have been reported in recent years [5]. In particular, two-phase CPAL (complementary pass-transistor adiabatic logic) flip-flops, which are more suitable for sequential circuits, have been proposed in [5].

In adiabatic circuits, energy dissipation occurs even for constant input signals, because the output nodes are always charged and discharged by the power-clocks. To reduce the dynamic and leakage power of adiabatic circuits during idle periods, power-gating techniques have been introduced, which shut down idle adiabatic units by disconnecting their power-clocks. Recently, a power-gating scheme for the CPAL flip-flops has been proposed [5]. However, these previously reported adiabatic sequential circuits, including the power-gating schemes, have been investigated only with SPICE simulations, which do not accurately capture the actual layout. Because the actual layout introduces parasitic elements, their effects on adiabatic circuits should not be neglected.

This paper focuses on the layout implementation, manufacture, and testing of CPAL flip-flops, including pad cells, for low-power digital chips. A two-phase non-overlap power-clock generator is presented to supply the two-phase CPAL sequential circuits. A power-gating scheme for the adiabatic sequential circuits is proposed. A practical sequential system with the proposed power-gating scheme is demonstrated using a mode-10 counter. For comparison, a conventional mode-10 counter is also implemented using a similar structure. The designs of energy-recycling output pad cells based on CPAL are described, and a conventional output pad cell is implemented for comparison. Full-custom layouts are drawn. The adiabatic and conventional flip-flops have been embedded in a test chip, which has been fabricated in a Chartered 0.35 µm process and tested. The two-phase CPAL sequential circuits with pad cells show large energy savings over a wide range of frequencies compared with their conventional CMOS counterparts.

## Review of two-phase CPAL circuits and sequential circuits

The basic structure of the CPAL buffer is shown in Fig. 1 [2]. It is composed of two main parts: the logic function circuit and the load-driving circuit. The logic function circuit consists of four NMOS transistors (N5–N8) that form a complementary pass-transistor logic (CPL) function block. The load-driving circuit consists of a pair of transmission gates (N1, P1 and N2, P2). The clamp transistors (N3 and N4) ensure stable operation by preventing the output nodes from floating. Cascaded CPAL gates are driven by two-phase non-overlap power-clocks [2]. A detailed description of the CPAL circuits can be found in [2, 5].



Fig.1. CPAL buffer using two-phase scheme and its waveforms

The adiabatic flip-flop can be built from a cascaded logic chain. The *T* flip-flop with reset and enable terminals based on two-phase CPAL circuits is shown in Fig. 2(a). Because pre-settable flip-flops are more universal and better suited to the design of adiabatic sequential circuits, a reset line is added to the adiabatic *T* flip-flop by using a multiplexer. When *Reset* is logic '1', the output $Q$ is set to '0'. If *Reset* is '0', the multiplexer functions as a CPAL buffer. An enable terminal (*EN*) has also been added to the *T* flip-flop. Assume that the present state of the flip-flop is $Q$. Then, when *EN* is logic '1', the next state $Q^+$ of the *T* flip-flop can be written as $Q^+ = T \oplus Q$. If *EN* is '0', the next state is $Q^+ = Q$. The layout of the *T* flip-flop with reset and enable terminals is shown in Fig. 2(b).
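The next-state behaviour described above can be sketched as a small behavioural model. This is only an illustration of the logic function, not of the CPAL circuit or its power-clock timing; the function name `t_ff_next_state` is hypothetical, and the priority of *Reset* over *EN* is an assumption based on the multiplexer description.

```python
def t_ff_next_state(q: int, t: int, en: int, reset: int) -> int:
    """Behavioural next state Q+ of a T flip-flop with reset and enable.

    Assumption: Reset takes priority over EN (not stated explicitly in the text).
    """
    if reset:
        return 0        # Reset = '1' forces Q to '0'
    if en:
        return q ^ t    # EN = '1': Q+ = T xor Q
    return q            # EN = '0': Q+ = Q (hold)


if __name__ == "__main__":
    # Enumerate the full truth table for inspection.
    print("Reset EN T Q -> Q+")
    for reset in (0, 1):
        for en in (0, 1):
            for t in (0, 1):
                for q in (0, 1):
                    print(reset, en, t, q, "->", t_ff_next_state(q, t, en, reset))
```
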



Fig.2. CPAL two-phase *T* flip-flop with reset and enable terminals (*Reset* and *EN*). (a) Schematic and (b) its layout

Complex sequential circuits can also be realized. As an example, a mode-10 adiabatic counter using the *T* flip-flops is shown in Fig. 3. Its structure is the same as that of the conventional CMOS implementation based on *T* flip-flops. The reset signal is generated by an *AND* gate with inputs $Q_0$ and $Q_3$. To synchronize the signals between the stages, the intermediate (mid-chain) signals of the *T* flip-flops that correspond to $Q_0$, $Q_1$, and $Q_2$ are used as inputs to the upper two *AND* gates, instead of the usual output signals ($Q_0$, $Q_1$, and $Q_2$).



Fig.3. Adiabatic mode-10 counter based on two-phase CPAL flip-flops
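As a rough behavioural illustration of the counter in Fig. 3, the sketch below steps four *T* flip-flops with a reset derived from $Q_0$ AND $Q_3$. The toggle inputs ($T_0 = 1$, $T_1 = Q_0$, $T_2 = Q_0 Q_1$, $T_3 = Q_0 Q_1 Q_2$) are the usual synchronous-counter choice and are assumed here rather than read from the figure; the function names are hypothetical, and the two-phase timing and mid-chain taps are not modelled.

```python
def mode10_step(state: int) -> int:
    """Advance a behavioural mode-10 counter built from four T flip-flops."""
    q = [(state >> i) & 1 for i in range(4)]      # q[0] is the LSB, Q0
    reset = q[0] & q[3]                           # AND of Q0 and Q3 detects 9 (1001)
    # Assumed toggle inputs of a standard synchronous binary counter.
    t = [1, q[0], q[0] & q[1], q[0] & q[1] & q[2]]
    # Reset forces every stage to 0; otherwise each stage computes Q+ = T xor Q.
    nxt = [0 if reset else q[i] ^ t[i] for i in range(4)]
    return sum(bit << i for i, bit in enumerate(nxt))


def run(cycles: int, start: int = 0) -> list:
    """Return the visited states, including the start state."""
    states = [start]
    for _ in range(cycles):
        states.append(mode10_step(states[-1]))
    return states


if __name__ == "__main__":
    print(run(12))
```

Starting from 0, the model steps through 0 to 9 and then wraps back to 0.
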

## Power-clock generator and power-gating scheme

The two-phase power-clocks can be generated from a single-phase sinusoidal power-clock, as shown in Fig. 4. The signal converter is used to convert the single-phase sinusoidal power-clock to the two-phase power-clocks. The mode-2 counter using the static CMOS flip-flop is used to synchronize the signal converter.

To reduce the energy loss of adiabatic units during idle periods, power-gating techniques can be introduced that switch off their power-clocks [5]. The power-gating scheme for the two-phase adiabatic sequential circuits is also shown in Fig. 4. A transmission gate (TG) is used as the power-gating switch; it is inserted between the single-phase power-clock (clk) and the virtual power-clock (pc) and disconnects the adiabatic logic block from the power-clock during idle periods. A clamp NMOS transistor prevents the virtual power-clock (pc) from floating. In active mode, the power-gating control signal (active) is high, so the virtual power-clock (pc) follows the power-clock (clk). In sleep mode, active is low, so the virtual power-clock (pc) is held at the low level and the power-clock generator as well as the power-gated adiabatic logic block are shut down.
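The active and sleep behaviour of the virtual power-clock can be illustrated with an idealized model. The sketch assumes a lossless switch, a sinusoidal power-clock swinging between 0 and a peak voltage, and hypothetical values (100 MHz, 3.3 V); none of these numbers or names come from the measured design.

```python
import math


def virtual_power_clock(t: float, freq_hz: float, v_peak: float, active: bool) -> float:
    """Idealized virtual power-clock pc at time t (seconds).

    Assumptions: clk is a sinusoid between 0 and v_peak, and the TG is lossless.
    """
    clk = 0.5 * v_peak * (1.0 - math.cos(2.0 * math.pi * freq_hz * t))
    # Active mode: the TG conducts and pc follows clk.
    # Sleep mode: the TG is off and the clamp NMOS holds pc at the low level.
    return clk if active else 0.0


if __name__ == "__main__":
    freq = 100e6      # hypothetical power-clock frequency
    v_peak = 3.3      # hypothetical peak voltage
    for k in range(5):
        t = k * 0.25 / freq
        print(f"t={t:.2e}s  active pc={virtual_power_clock(t, freq, v_peak, True):.3f} V"
              f"  sleep pc={virtual_power_clock(t, freq, v_peak, False):.3f} V")
```
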



## Adiabatic output pad cells

A basic output pad cell and its layout are shown in Fig. 5. The cell consists mainly of bonding pads, electrostatic discharge (ESD) protection circuits, and two-stage conventional buffers (INV1 and INV2) that drive the chip pads. The gate-grounded NMOS (GGNMOS) and the gate-VDD PMOS (GDPMOS) are often designed with large device dimensions and a wider drain-contact-to-poly-gate layout spacing to sustain the desired ESD level [6, 7, 8]. The second-stage output driver (INV2) drives the pad, which has a large capacitance. The resistor R in the ESD protection circuit is usually included to protect the output driver (INV2) effectively. The series resistance and the large junction capacitances of the ESD clamp devices add a long resistance-capacitance (RC) delay to the output signals, so the second-stage output driver (INV2) should use sufficiently large device dimensions to reduce the RC delay.



Fig.5. Basic output pad cell for the conventional CMOS circuits. (a) Structure and (b) Its layout

The energy dissipation of CMOS circuits consists mainly of two terms: dynamic dissipation for charging and discharging load capacitances, and dissipation due to leakage currents [9]. The leakage dissipation is negligible for the process used here, so charging and discharging the node capacitances on the chip pads accounts for most of the energy dissipation. This energy dissipation per cycle can be written as

$$E_{\text{Static}} = C_{\text{L}} V_{\text{DD}}^{2} \tag{1}$$

where  $C_L$  is load capacitance of the output driver, and  $V_{DD}$  is supply voltage.

In the output pad cells, the energy loss of the second-stage output driver (INV2), which drives the bonding pad, is the dominant part because its load capacitance is much larger than that of the other nodes.

The energy-recycling output pad cell is similar to the conventional CMOS one, except that the CMOS inverters are replaced by two-stage energy-recycling buffers to reduce the energy dissipation on the large load capacitances. The energy-recycling output pad cell using CPAL drivers is shown in Fig. 6(a); it consists mainly of chip bonding pads, ESD protection circuits, and two-stage CPAL buffers. Fig. 6(b) shows its layout.



Fig.6. Energy-recycling pad cell using CPAL drivers. (a) Structure and (b) its layout

The CPAL circuits use complementary pass-transistor logic (N5–N8) for logic evaluation and transmission gates (N1, P1 and N2, P2) for driving output loads. The energy loss per cycle can be expressed as

$$E_{\text{CPAL}} = C_{\text{X}} (V_{\text{DD}} - V_{\text{TN}}) V_{\text{TN}} + \frac{1}{2} C_{\text{X}} (V_{\text{DD}} - V_{\text{TN}})^2 + 2 \left(\frac{R C_{\text{L}}}{T}\right) C_{\text{L}} V_{\text{DD}}^2 \tag{2}$$

where $C_X$ is the capacitance of node X or Xb, $V_{DD}$ is the peak voltage of the power-clocks, $V_{TN}$ is the threshold voltage of the NMOS transistors, $C_L$ is the load capacitance of the CPAL circuits, $R$ is the turn-on resistance of the transmission gates, and $T$ is the transition time of the power-clocks. In (2), the first and second terms represent the non-adiabatic energy loss on the internal nodes (X and Xb) of the CPAL circuits, and the third term represents the full-adiabatic energy loss on the output nodes. Since $T$ is much longer than $RC_L$, the third term becomes negligible and the energy dissipation is almost constant:

$$E_{\text{CPAL}} \approx C_{\text{X}} (V_{\text{DD}} - V_{\text{TN}}) V_{\text{TN}} + \frac{1}{2} C_{\text{X}} (V_{\text{DD}} - V_{\text{TN}})^2 \tag{3}$$

If the CPAL buffers are used to drive the large node capacitances on the bonding pads, the energy loss can be greatly reduced compared with the conventional CMOS pad cells, because the capacitance $C_X$ of the internal nodes is far smaller than the load capacitance $C_L$ on the pads and the non-adiabatic energy loss on the large node capacitances is eliminated in the CPAL circuits.
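To make the comparison between (1) and (2) concrete, the following sketch evaluates both expressions for one set of component values. All numbers (a 5 pF pad load, 50 fF internal node, 3.3 V peak, 0.6 V threshold, 20 Ω on-resistance, 5 ns transition time) are hypothetical placeholders chosen only so that $T \gg RC_L$; they are not measurements from the test chip.

```python
def e_cmos(c_l: float, v_dd: float) -> float:
    """Energy per cycle of the conventional driver, Eq. (1)."""
    return c_l * v_dd ** 2


def e_cpal(c_x: float, c_l: float, v_dd: float, v_tn: float, r: float, t: float) -> float:
    """Energy per cycle of the CPAL buffer, Eq. (2)."""
    # Non-adiabatic loss on the internal nodes X / Xb (first two terms).
    non_adiabatic = c_x * (v_dd - v_tn) * v_tn + 0.5 * c_x * (v_dd - v_tn) ** 2
    # Full-adiabatic loss on the output node (third term).
    adiabatic = 2.0 * (r * c_l / t) * c_l * v_dd ** 2
    return non_adiabatic + adiabatic


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    c_l, c_x = 5e-12, 50e-15        # farads
    v_dd, v_tn = 3.3, 0.6           # volts
    r, t = 20.0, 5e-9               # ohms, seconds (R*C_L = 0.1 ns << T)

    cmos = e_cmos(c_l, v_dd)
    cpal = e_cpal(c_x, c_l, v_dd, v_tn, r, t)
    print(f"E_Static = {cmos * 1e12:.2f} pJ per cycle")
    print(f"E_CPAL   = {cpal * 1e12:.2f} pJ per cycle")
    print(f"saving   = {100.0 * (1.0 - cpal / cmos):.1f} %")
```

With these placeholder values the non-adiabatic terms depend only on $C_X$, which is why the CPAL loss stays small even for a large pad capacitance.
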

To reduce the full-adiabatic energy dissipation on the output nodes and maintain high operating frequencies, the adiabatic buffers should use sufficiently large device sizes when driving a large node capacitance on the bonding pads. The energy dissipation of the CPAL buffers when driving load capacitances has been investigated, as shown in Fig. 7.



Fig. 7. The energy dissipation of the CPAL buffer per cycle versus channel width of N1 and N2 (the channel width of P1 and P2 is two times as large as N1 and N2, and the frequency is 100MHz).

According to (2), the full-adiabatic energy loss of the CPAL buffer can be reduced by increasing the channel width of its MOS transistors (N1, N2, P1, and P2), whereas its non-adiabatic energy loss can be reduced by reducing the channel width of N1 and N2. Therefore, we can choose the optimal size of N1 and N2 to minimize the total energy loss for a given load capacitance.
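The width trade-off can be sketched with a simple first-order model of (2). The model assumes that the internal node capacitance $C_X$ grows linearly with the channel width $W$ and that the on-resistance $R$ falls as $1/W$, so the energy takes the form $E(W) = aW + b/W$ with a minimum at $W = \sqrt{b/a}$. These scaling laws, the per-micron constants, and the function names are illustrative assumptions, not values extracted from Fig. 7.

```python
import math


def cpal_energy(w_um, c_l, v_dd, v_tn, t, cx_per_um, r_ohm_um):
    """Eq. (2) with assumed width scaling: C_X = cx_per_um * W, R = r_ohm_um / W."""
    c_x = cx_per_um * w_um      # internal node capacitance (assumed linear in W)
    r = r_ohm_um / w_um         # transmission-gate on-resistance (assumed ~ 1/W)
    non_adiabatic = c_x * (v_dd - v_tn) * v_tn + 0.5 * c_x * (v_dd - v_tn) ** 2
    adiabatic = 2.0 * (r * c_l / t) * c_l * v_dd ** 2
    return non_adiabatic + adiabatic


def optimal_width(c_l, v_dd, v_tn, t, cx_per_um, r_ohm_um):
    """Width that minimizes E(W) = a*W + b/W, i.e. W = sqrt(b / a)."""
    a = cx_per_um * ((v_dd - v_tn) * v_tn + 0.5 * (v_dd - v_tn) ** 2)
    b = 2.0 * r_ohm_um * c_l ** 2 * v_dd ** 2 / t
    return math.sqrt(b / a)


if __name__ == "__main__":
    # Hypothetical parameters for illustration only.
    params = dict(c_l=5e-12, v_dd=3.3, v_tn=0.6, t=5e-9,
                  cx_per_um=2e-15, r_ohm_um=5000.0)
    w_opt = optimal_width(**params)
    print(f"optimal width under this model: {w_opt:.1f} um")
    for scale in (0.5, 1.0, 2.0):
        w = scale * w_opt
        print(f"W = {w:7.1f} um -> E = {cpal_energy(w, **params) * 1e12:.3f} pJ")
```
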

## Post-layout simulations

Two mode-10 adiabatic counters, with and without the power-gating scheme, have been implemented, together with a conventional CMOS mode-10 counter. The three counters, along with an adiabatic 32×32 register file and an 8×8 multiplier, have been integrated in a test chip fabricated in Chartered 0.35 µm CMOS technology. The full-custom layouts were drawn with the Virtuoso<sup>TM</sup> Layout Editor. The layout of the CPAL mode-10 adiabatic counter without the power-gating scheme is shown in Fig. 8; its area is about 49 µm × 136 µm.

The size of the power-gating switch has been optimized. Considering the energy overhead of the switch and its area penalty, the channel widths of the NMOS and PMOS transistors of the TG are set to 40λ (8 µm) and 80λ (16 µm), respectively, where λ = 0.2 µm.



Fig.8. The layout of mode-10 counter with power-clock generator

Energy loss comparisons of the two-phase CPAL and conventional mode-10 counters are shown in Fig. 9. The adiabatic counter attains energy savings of 87% at 50 MHz and 44% at 200 MHz compared with its CMOS counterpart. Fig. 10 shows the energy loss per ten cycles of the mode-10 counter without and with the power-gating switches. The mode-10 counter with the power-gating scheme attains energy savings of 78.9% to 77.3% for clock rates ranging from 50 to 200 MHz at a = 0.2. At 100 MHz, energy savings of 85.7%, 74.1%, 65.1%, and 55.3% are achieved when a is 0.1, 0.2, 0.3, and 0.4, respectively. Greater energy savings can be achieved with longer sleep times.



Fig.9. Energy consumption comparisons of the two-phase CPAL and conventional mode-10 counters



Fig.10. Energy consumption comparisons of the two-phase CPAL mode-10 counters with and without power-gating switches

## Manufacture and test of adiabatic flip-flops with pads

The test chip has been fabricated, and its photo is shown in Fig. 11. The core of the test chip includes several energy-recycling circuits and their corresponding conventional CMOS implementations. The circuits used to test the chip are shown in Fig. 12.



Fig. 11. The photo of the test chip

Fig. 13 shows the test waveforms of the adiabatic flip-flops. *PC* is the power-clock that drives the energy-recycling counters. "*CPAL counter output*" denotes the outputs of the CPAL counter.



Fig. 12. Test circuits (block labels: core circuits for testing; power-clock generator)



Fig. 13. Test waveforms of the output pads

## Conclusion

The physical implementation and testing of adiabatic flip-flops are described in this paper. The post-layout simulation results show that adiabatic CPAL sequential circuits with pad cells consume much less energy than the conventional CMOS implementations. The proposed adiabatic flip-flops with energy-recycling pad cells could be used in ultra-low-energy applications.

## References

- [1] Zhang W. Q., Su L., Zhang Y., Li L. F., Hu J. P., Low-leakage flip-flops based on dual-threshold and multiple leakage reduction techniques, Journal of Circuits, Systems and Computers, 20 (2011), No. 1, 147-162.
- [2] Hu J. P., Xu T. F., Li H., A lower-power register file based on complementary pass-transistor adiabatic logic, IEICE Trans. on Inf. & Sys., E88-D (2005), No. 7, 1479-1485.
- [3] Maksimovic D., Oklobdzija V. G., Nikolic B., Current K. W., Clocked CMOS adiabatic logic with integrated single-phase power-clock supply, IEEE Trans. on VLSI Systems, 8 (2000), No. 4, 460-463.
- [4] Kim S., Papaefthymiou M. C., True single-phase adiabatic circuitry, IEEE Trans. on VLSI Systems, 9 (2001), No. 1, 52-63.
- [5] Zhang W. Q., Zhou D., Hu X. Y., Hu J. P., The implementations of adiabatic flip-flops and sequential circuits with power-gating schemes, in Proc. IEEE MWSCAS, 2007, pp. 767-770.
- [6] Ker M. D., Chen S. H., Chuang C. H., ESD failure mechanisms of analog I/O cells in 0.18-µm CMOS technology, IEEE Trans. on Device and Materials Reliability, 6 (2006), No. 1, 102-111.
- [7] Daniel S., Krieger G., Process and design optimization for advanced CMOS I/O ESD protection devices, in Proc. EOS/ESD, 1990, pp. 206-213.
- [8] Ker M. D., Chen T. Y., Wu C. Y., Chang H. H., ESD protection design on analog pin with very low input capacitance for high-frequency or current-mode applications, IEEE J. Solid-State Circuits, 35 (2000), No. 8, 1194-1199.
- [9] Hu J. P., Yu X. Y., Low voltage and low power pulse flip-flops in nanometer CMOS processes, Current Nanoscience, 8 (2012), No. 1, 102-107.

**Authors**: Jintao Jiang, Faculty of Information Science and Technology, Ningbo University, E-mail: nbhjp@yahoo.com.cn; prof. Jianping Hu, Faculty of Information Science and Technology, Ningbo University, E-mail: hujianping2@nbu.edu.cn.