# Carry Select Adder Implementation using Asynchronous Fine Grain Power Gated Logic

# Nadisha E B<sup>1</sup>, Akhila P R<sup>2</sup>

<sup>1</sup>M.Tech Student, Department of Electronics and Communication Engineering, SCMS School of Engineering and Technology, Karukutty, Cochin, Kerala, India

<sup>2</sup>Assistant Professor, Department of Electronics and Communication Engineering, SCMS School of Engineering and Technology, Karukutty, Cochin, Kerala, India

**Abstract:** This paper presents a low-power logic family called asynchronous fine-grain power-gated logic (AFPL). Each pipeline stage comprises efficient charge recovery logic (ECRL) gates, which implement the logic function, and a handshake controller. ECRL gates have negligible leakage power dissipation. By incorporating a partial charge reuse (PCR) mechanism, the energy required to complete the evaluation of an ECRL gate can be reduced. Moreover, AFPL-PCR adopts a $C^*$-element in its handshake controllers. To mitigate the hardware overhead of the AFPL circuit, circuit simplification techniques have been developed.

Keywords: AFPL circuits, CSLA adders, PCR mechanism, ECRL logic gates.

### 1. Introduction

As feature sizes continue to shrink and transistor density increases, power dissipation has become an important concern in nanoscale CMOS VLSI design. As threshold voltage, gate oxide thickness, and channel length are scaled down, leakage is becoming a significant contributor to total power dissipation. Various techniques for reducing leakage have been proposed, both at the circuit level and at the process technology level. At the circuit level, the leakage reduction techniques include transistor stacking, dual-threshold CMOS, reverse body biasing, and power gating. Power gating is highly efficient for leakage reduction. In general, power gating increases the resistance of leakage paths by inserting sleep transistors (the power-gating transistors) where necessary. In the idle (sleep) mode, the sleep transistors are turned off and the leakage current is greatly reduced; in the active mode, the sleep transistors are turned on and the pull-up and pull-down networks are reconnected to the power supply rails. For synchronous circuits, power gating can be implemented in a fine-grain (gate-level) or coarse-grain manner. Fine-grain power gating can considerably reduce leakage dissipation at run time compared with the coarse-grain approach. A coarse-grain power-gated synchronous system has the following disadvantages: 1) it needs a complex power network; 2) it requires wakeup rush current control to prevent ground bounce noise; 3) it has longer wakeup latency; and 4) it needs rigorous static and dynamic IR drop analysis.

Asynchronous circuits employ local handshaking protocols for transferring data, so they are data-driven and become active only when doing useful work. That is, asynchronous circuits do not switch when inactive. However, inactive asynchronous circuits still suffer leakage dissipation. Several techniques have recently been proposed for power gating asynchronous circuits to reduce their static power.

Asynchronous circuits can be power-gated at gate-level granularity. One proposed method is asynchronous adiabatic logic (AAL). Each AAL stage consists of an adiabatic gate, which implements the logic function of the stage, and a control and regeneration (C&R) block. When the C&R block detects that the input to the gate has become valid data, its output goes HIGH and the adiabatic gate is powered; when the C&R block detects that the input has become empty, its output goes LOW and the adiabatic gate is unpowered and remains idle. Synchronization between neighboring stages is accomplished by a unidirectional control signal (the output of the C&R block) rather than by bidirectional handshake signals. Consequently, in an AAL circuit whose pipeline stages have diverse propagation delays, a data token propagating along the pipeline may be overwritten by the token that follows it.

AFPL can achieve fine-grain power gating to reduce static power dissipation without excessive hardware overhead. The partial charge reuse (PCR) mechanism can be combined with AFPL to reduce the energy required to complete the evaluation of a logic block.

# 2. Literature Survey

AFPL can be combined with a mechanism called partial charge reuse (PCR). AFPL with this mechanism is denoted AFPL-PCR, and AFPL without it is denoted AFPL w/o PCR. Fig. 1 shows the structure of the AFPL pipelines. In AFPL w/o PCR [see Fig. 1(a)], a pipeline stage, denoted by $S_i$, comprises efficient charge recovery logic (ECRL) gates G<sub>i</sub> and a handshake controller HC<sub>i</sub>, which handles handshaking with the neighboring stages and provides power to the ECRL gates $G_i$. In AFPL-PCR [see Fig. 1(b)], a pipeline stage, denoted by $S_{i+1}$, has an additional PCR unit PCR<sub>i+1</sub>, which controls charge reuse between pipeline stages S<sub>i</sub> and S<sub>i+2</sub>.

Volume 4 Issue 11, November 2015 <u>www.ijsr.net</u> Licensed Under Creative Commons Attribution CC BY

### International Journal of Science and Research (IJSR) ISSN (Online): 2319-7064 Index Copernicus Value (2013): 6.14 | Impact Factor (2013): 4.438



Figure 1: AFPL pipelines. (a) AFPL w/o PCR pipeline. (b) AFPL-PCR pipeline

There are two main differences between AFPL-PCR and AFPL w/o PCR. First, AFPL-PCR employs the PCR unit PCR<sub>i+1</sub> to control charge reuse between pipeline stages S<sub>i</sub> and S<sub>i+2</sub>. Second, the handshake controller HC<sub>i</sub> in AFPL-PCR employs an enhanced C-element, called the C\*-element, to control the power node Vp<sub>i</sub> of the ECRL gates. The C\*-element has the advantage that an ECRL gate can discharge early if its outputs are no longer required, without waiting for the next empty token to arrive at the stage. As shown in Fig. 1(b), in the PCR<sub>i+1</sub> unit, transistor M2 is used as a diode that allows current to flow only from Vp<sub>i</sub> to Vp<sub>i+2</sub>, and transistor M1 is used as a switch that is turned on when charge reuse is activated.
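As background for the handshake controllers, the following is a minimal behavioural sketch of a standard two-input Muller C-element, whose output changes only when both inputs agree and otherwise holds its previous value. It does not model the enhanced C\*-element or its early-discharge behaviour, whose internals are not given in this text; the class name `CElement` is hypothetical.

```python
class CElement:
    """Behavioural model of a standard two-input Muller C-element (not the C*-element)."""

    def __init__(self, initial: int = 0) -> None:
        self.out = initial

    def update(self, a: int, b: int) -> int:
        # The output follows the inputs only when they agree; otherwise it holds.
        if a == b:
            self.out = a
        return self.out


c = CElement()
trace = [c.update(a, b) for a, b in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]]
print(trace)  # [0, 0, 1, 1, 0]
```

The state-holding behaviour is what lets a handshake controller wait until both its neighbours have signalled before it changes the power node.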

In order to evaluate the effectiveness of the proposed AFPL, we employed AFPL w/o PCR and AFPL-PCR to implement an eight-bit five-stage pipelined Kogge–Stone adder for performance comparison. The adder is simulated using VHDL in Xilinx ISE Design Suite 8.1, and the power consumption of the AFPL w/o PCR and AFPL-PCR versions is compared. The AFPL w/o PCR implementation can reduce power dissipation by 19.1%–32.0%, and the AFPL-PCR implementation can reduce power dissipation by 30.6%–55.3%.

# 3. AFPL-PCR implementation of an Eight-Bit Five-Stage Pipelined Kogge–Stone adder



Figure 2: 8 bit Kogge–Stone adder schematic

The Kogge–Stone (KS) adder is a parallel-prefix carry look-ahead adder. It generates the carry signals in O(log n) time and is widely considered the fastest adder design. It is a common design for high-performance adders in industry, but it occupies a large area. In this implementation, all ECRL gates in the same pipeline stage share a common handshake controller to mitigate the hardware overhead. The logic blocks of the Kogge–Stone adder consist of 83 ECRL gates (508 transistors); the handshake controllers and PCR units consist of 14 logic gates (83 transistors). That is, the handshake controllers and PCR units account for 14% of the total transistor count (83 of 591 transistors).
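The parallel-prefix carry computation can be illustrated with a small bit-level model. This is only a functional sketch in Python, not the VHDL or the ECRL gate-level design used in this work; the function name `kogge_stone_add` and its parameters are hypothetical.

```python
def kogge_stone_add(a: int, b: int, width: int = 8) -> tuple[int, int]:
    """Return (sum, carry_out) of two width-bit operands using a Kogge-Stone prefix network."""
    # Bitwise generate and propagate signals, least significant bit first.
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]
    G, P = g[:], p[:]
    dist = 1
    while dist < width:
        # Each prefix stage combines position i with position i - dist.
        G_next, P_next = G[:], P[:]
        for i in range(dist, width):
            G_next[i] = G[i] | (P[i] & G[i - dist])
            P_next[i] = P[i] & P[i - dist]
        G, P = G_next, P_next
        dist *= 2
    # After the last stage, G[i] is the carry out of bit i (carry-in assumed 0).
    carries = [0] + G[:-1]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    return total, G[width - 1]
```

For `width = 8` the loop runs three prefix stages (distances 1, 2 and 4), which is the O(log n) behaviour mentioned above. How these stages map onto the five pipeline stages of the implemented adder is not stated in the text.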

| Power summary:                     | I(mA) | P(mW) |
|------------------------------------|-------|-------|
| Total estimated power consumption: |       | 819   |
| Vccint 1.80V:                      | 451   | 812   |
| Vcco33 3.30V:                      | 2     | 7     |
| Inputs:                            | 7     | 13    |
| Logic:                             | 349   | 629   |
| Outputs:                           |       |       |
| Vcco33                             | 0     | 0     |
| Signals:                           | 80    | 143   |
| Quiescent Vccint 1.80V:            | 15    | 27    |
| Quiescent Vcco33 3.30V:            | 2     | 7     |

Figure 3: Simulated result without using PCR mechanism

Figure 3 shows the power consumption of the eight-bit five-stage pipelined Kogge–Stone adder without the PCR mechanism. The power was estimated using Xilinx ISE 8.1, and the total estimated power consumption was found to be 819 mW.

| Power summary:                     | I(mA) | P(mW) |
|------------------------------------|-------|-------|
| Total estimated power consumption: |       | 711   |
| Vccint 1.80V:                      | 391   | 704   |
| Vcco33 3.30V:                      | 2     | 7     |
| Inputs:                            | 7     | 13    |
| Logic:                             | 305   | 549   |
| Outputs:                           |       |       |
| Vcco33                             | 0     | 0     |
| Signals:                           | 64    | 115   |
| Quiescent Vccint 1.80V:            | 15    | 27    |
| Quiescent Vcco33 3.30V:            | 2     | 7     |

Figure 4: Simulated result using PCR mechanism

Figure 4 shows the power consumption of the eight-bit five-stage pipelined Kogge–Stone adder with the PCR mechanism. The power was estimated using Xilinx ISE 8.1, and the total estimated power consumption was found to be 711 mW. In the Kogge–Stone adder, carries are generated quickly by computing them in parallel, at the cost of increased area.
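Taking the totals in Figures 3 and 4 at face value, the relative saving attributable to PCR in these estimates can be worked out directly (the symbols $P_{\text{w/o PCR}}$ and $P_{\text{PCR}}$ are introduced here only for illustration):

$$
\frac{P_{\text{w/o PCR}} - P_{\text{PCR}}}{P_{\text{w/o PCR}}} = \frac{819 - 711}{819} = \frac{108}{819} \approx 13.2\%
$$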

### 4. Delay, Area and Power Evaluation of 16bit Conventional Carry Select Adder(CSLA)



The carry select adder (CSLA) is one of the fastest adders and is used in many data processing systems to perform fast arithmetic functions. This method uses a simple and efficient gate-level modification to reduce the area and power of the CSLA. The proposed method has reduced area and power compared with the regular SQRT CSLA, with a slight increase in delay. In this work, the performance of the proposed designs is evaluated in terms of delay, area, power, and their products, both by hand using logical effort and through custom design and layout in a 0.18-µm CMOS process technology. The results show that the proposed structure is better than the regular SQRT CSLA. The delay, area and power evaluation methodology treats every block as being made up of AND, OR, and inverter (AOI) gates, each having delay and area equal to 1 unit. The delay is obtained by adding up the number of gates in the longest path of a logic block, which contributes the maximum delay. The area is evaluated by counting the total number of AOI gates required for each logic block.
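The unit-gate counting described above can be sketched as follows for a single block. The netlist is a hypothetical example, a two-input XOR decomposed into AND, OR and inverter gates; the names `XOR_NETLIST`, `area` and `delay` are illustrative and are not taken from the design files.

```python
from functools import lru_cache

# Hypothetical netlist for XOR(a, b) = (a AND NOT b) OR (NOT a AND b).
# Each entry maps a gate's output net to the nets that feed it.
XOR_NETLIST = {
    "na": ["a"],          # inverter
    "nb": ["b"],          # inverter
    "t0": ["a", "nb"],    # AND
    "t1": ["na", "b"],    # AND
    "y": ["t0", "t1"],    # OR
}


def area(netlist: dict[str, list[str]]) -> int:
    """Area in unit gates: every AND, OR or inverter counts as 1."""
    return len(netlist)


def delay(netlist: dict[str, list[str]], output: str) -> int:
    """Delay in unit gates: number of gates on the longest path to `output`."""

    @lru_cache(maxsize=None)
    def depth(net: str) -> int:
        if net not in netlist:  # primary input, contributes no gate delay
            return 0
        return 1 + max(depth(src) for src in netlist[net])

    return depth(output)


print(area(XOR_NETLIST), delay(XOR_NETLIST, "y"))  # 5 3
```

Under this counting the XOR block has an area of 5 units and a delay of 3 units; larger blocks can be evaluated the same way by extending the netlist.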

Design Summary

| Item                                                   | Used | Available | Utilization |
|--------------------------------------------------------|------|-----------|-------------|
| Number of errors                                       | 0    |           |             |
| Number of warnings                                     | 16   |           |             |
| **Logic Utilization**                                  |      |           |             |
| Number of Slice Flip Flops                             | 49   | 13,824    | 1%          |
| Number of 4 input LUTs                                 | 35   | 13,824    | 1%          |
| **Logic Distribution**                                 |      |           |             |
| Number of occupied Slices                              | 51   | 6,912     | 1%          |
| Number of Slices containing only related logic         | 51   | 51        | 100%        |
| Number of Slices containing unrelated logic            | 0    | 51        | 0%          |
| Total Number 4 input LUTs                              | 67   | 13,824    | 1%          |
| Number used as logic                                   | 35   |           |             |
| Number used for 32x1 RAMs (two LUTs used per 32x1 RAM) | 32   |           |             |
| Number of bonded IOBs                                  | 47   | 510       | 9%          |
| IOB Flip Flops                                         | 11   |           |             |
| Number of GCLKs                                        | 1    | 4         | 25%         |
| Number of GCLKIOBs                                     | 1    | 4         | 25%         |

Total equivalent gate count for design: 4,837 Additional JTAG gate count for IOBs: 2,304 Peak Memory Usage: 192 MB

(a)

| Power summary:                     | I(mA) | P(mW) |
|------------------------------------|-------|-------|
| Total estimated power consumption: |       | 173   |
| Vccint 1.80V:                      | 93    | 167   |
| Vcco33 3.30V:                      | 2     | 7     |
| Clocks:                            | 69    | 125   |
| Inputs:                            | 8     | 15    |
| Logic:                             | 0     | 0     |
| Outputs:                           |       |       |
| Vcco33                             | 0     | 0     |
| Signals:                           | 0     | 0     |
| Quiescent Vccint 1.80V:            | 15    | 27    |
| Quiescent Vcco33 3.30V:            | 2     | 7     |

TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE. FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT GENERATED AFTER PLACE-and-ROUTE.

Clock Information:

| Clock Signal | Clock buffer(FF name) | Load |
|--------------|-----------------------|------|
| clk          | BUFGP                 | 1.76 |

Timing Summary:

Speed Grade: -6

| Timing constraint:                         | Default period analysis for Clock 'clk' |
|--------------------------------------------|-----------------------------------------|
| Clock period:                              | 7.435ns (frequency: 134.499MHz)         |
| Total number of paths / destination ports: | 341 / 133                               |

(c)

Figure 6: Conventional CSLA (a) area, (b) power, (c) time

Figure 6 shows the area, power and delay of conventional CSLA simulated using VHDL in Xilinx ISE Design Suite 8.1.

# 5. Delay, Area and Power Evaluation Methodology of regular 16-b SQRT CSLA

The structure of the 16-b regular SQRT CSLA is shown in Fig. 7. It has five groups of RCAs of different sizes.
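A functional sketch of the carry-select principle may help here: each group after the first computes its sum for both possible carry-in values, and the actual incoming carry selects one of them. The group widths below (2, 2, 3, 4 and 5 bits, starting from the least significant end) are an assumption based on common descriptions of the 16-bit SQRT CSLA; the text only states that there are five groups. All function names are hypothetical.

```python
# Assumed group widths for a 16-bit SQRT CSLA, least significant group first.
GROUP_WIDTHS = (2, 2, 3, 4, 5)


def ripple_carry_add(a: int, b: int, cin: int, width: int) -> tuple[int, int]:
    """Bit-serial ripple-carry addition; returns (sum, carry_out)."""
    total, carry = 0, cin
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def sqrt_csla_add(a: int, b: int, cin: int = 0) -> tuple[int, int]:
    """16-bit carry-select addition; returns (sum, carry_out)."""
    total, carry, offset = 0, cin, 0
    for idx, width in enumerate(GROUP_WIDTHS):
        mask = (1 << width) - 1
        ga, gb = (a >> offset) & mask, (b >> offset) & mask
        if idx == 0:
            # The first group is a single RCA driven by the real carry-in.
            s, carry = ripple_carry_add(ga, gb, carry, width)
        else:
            # Later groups precompute both outcomes; the incoming carry selects one.
            s0, c0 = ripple_carry_add(ga, gb, 0, width)
            s1, c1 = ripple_carry_add(ga, gb, 1, width)
            s, carry = (s1, c1) if carry else (s0, c0)
        total |= s << offset
        offset += width
    return total, carry
```

The duplicated RCA in every group after the first is the area cost that the modified design targets.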



| Item                                                   | Used | Available | Utilization |
|--------------------------------------------------------|------|-----------|-------------|
| Number of 4 input LUTs                                 | 37   | 13,824    | 1%          |
| **Logic Distribution**                                 |      |           |             |
| Number of occupied Slices                              | 39   | 6,912     | 1%          |
| Number of Slices containing only related logic         | 39   | 39        | 100%        |
| Number of Slices containing unrelated logic            | 0    | 39        | 0%          |
| Total Number 4 input LUTs                              | 70   | 13,824    | 1%          |
| Number used as logic                                   | 37   |           |             |
| Number used for 32x1 RAMs (two LUTs used per 32x1 RAM) | 32   |           |             |
| Number used as Shift registers                         | 1    |           |             |
| Number of bonded IOBs                                  | 30   | 510       | 5%          |
| IOB Flip Flops                                         | 9    |           |             |
| Number of GCLKs                                        | 1    | 4         | 25%         |
| Number of GCLKIOBs                                     | 1    | 4         | 25%         |

Total equivalent gate count for design: 4,737 Additional JTAG gate count for IOBs: 1,488 Peak Memory Usage: 192 MB

(a)

| Power summary:                     | I(mA) | P(mW) |
|------------------------------------|-------|-------|
| Total estimated power consumption: |       | 147   |
| Vccint 1.80V:                      | 78    | 140   |
| Vcco33 3.30V:                      | 2     | 7     |
| Clocks:                            | 54    | 98    |
| Inputs:                            | 8     | 15    |
| Logic:                             | 0     | 0     |
| Outputs:                           |       |       |
| Vcco33                             | 0     | 0     |
| Signals:                           | 0     | 0     |
| Quiescent Vccint 1.80V:            | 15    | 27    |
| Quiescent Vcco33 3.30V:            | 2     | 7     |

(b)

Timing Summary:

Speed Grade: -6

Minimum period: 7.505ns (Maximum Frequency: 133.245MHz) Minimum input arrival time before clock: 5.720ns Maximum output required time after clock: 8.919ns Maximum combinational path delay: No path found

Timing Detail:

All values displayed in nanoseconds (ns)

| Timing constraint:                         | Default period analysis for Clock 'clk' |
|--------------------------------------------|-----------------------------------------|
| Clock period:                              | 7.505ns (frequency: 133.245MHz)         |
| Total number of paths / destination ports: | 378 / 111                               |
| Delay:                                     | 7.505ns (Levels of Logic = 2)           |
| Source:                                    | X0/x2/add 1 1 (FF)                      |
| Destination:                               | X1/data out 0 (FF)                      |
| Source Clock:                              | clk rising                              |
| Destination Clock:                         | clk rising                              |

(c)

Figure 8: Regular 16-b SQRT CSLA (a) area, (b) power, (c) time

Figure 8 shows the area, power and delay of the regular 16-b SQRT CSLA simulated using VHDL in Xilinx ISE Design Suite 8.1.

# 6. Delay, Area and Power Evaluation of Modified 16-b SQRT CSLA

The structure of the proposed 16-b SQRT CSLA, which uses BEC (Binary to Excess-1 Converter) logic in place of the RCA (ripple carry adder) with carry input c<sub>in</sub>=1 to optimize area and power, is shown in Fig. 9.



**Figure 9:** Modified 16-b SQRT CSLA. The parallel RCA with c<sub>in</sub>=1 is replaced with BEC
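The BEC substitution can be sketched functionally: the group keeps the RCA with c<sub>in</sub>=0, and a BEC adds one to that result to obtain the c<sub>in</sub>=1 case, so the second RCA is no longer needed. The code below is an illustrative model under that reading; the function names `bec` and `bec_group` are hypothetical, and plain integer addition stands in for the RCA.

```python
def bec(x: int, width: int) -> int:
    """Binary to Excess-1 Converter: returns (x + 1) modulo 2**width, bit by bit."""
    out, all_lower_ones = 0, 1
    for i in range(width):
        bit = (x >> i) & 1
        # Bit i toggles exactly when every lower bit is 1 (bit 0 always toggles).
        out |= (bit ^ all_lower_ones) << i
        all_lower_ones &= bit
    return out


def bec_group(ga: int, gb: int, cin: int, width: int) -> tuple[int, int]:
    """One carry-select group: RCA result for carry-in 0, BEC for carry-in 1, then a mux."""
    raw = ga + gb                    # stands in for the width-bit RCA with carry-in 0
    plus_one = bec(raw, width + 1)   # a (width+1)-bit BEC covers the sum bits and the carry
    chosen = plus_one if cin else raw
    return chosen & ((1 << width) - 1), chosen >> width
```

Because the RCA output spans `width + 1` bits including its carry, the BEC in this sketch is one bit wider than the group.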

Design Summary

| Item                                           | Used | Available | Utilization |
|------------------------------------------------|------|-----------|-------------|
| Number of errors                               | 0    |           |             |
| Number of warnings                             | 1    |           |             |
| **Logic Utilization**                          |      |           |             |
| Number of Slice Flip Flops                     | 22   | 13,824    | 1%          |
| Number of 4 input LUTs                         | 29   | 13,824    | 1%          |
| **Logic Distribution**                         |      |           |             |
| Number of occupied Slices                      | 23   | 6,912     | 1%          |
| Number of Slices containing only related logic | 23   | 23        | 100%        |
| Number of Slices containing unrelated logic    | 0    | 23        | 0%          |
| Total Number 4 input LUTs                      | 31   | 13,824    | 1%          |
| Number used as logic                           | 29   |           |             |
| Number used as Shift registers                 | 2    |           |             |
| Number of bonded IOBs                          | 4    | 510       | 1%          |
| IOB Flip Flops                                 | 1    |           |             |
| Number of GCLKs                                | 1    | 4         | 25%         |
| Number of GCLKIOBs                             | 1    | 4         | 25%         |

Total equivalent gate count for design: 614 Additional JTAG gate count for IOBs: 240 Peak Memory Usage: 192 MB

(a)

| Power summary:                     | I(mA) | P(mW) |
|------------------------------------|-------|-------|
| Total estimated power consumption: |       | 133   |
| Vccint 1.80V:                      | 70    | 126   |
| Vcco33 3.30V:                      | 2     | 7     |
| Clocks:                            | 47    | 84    |
| Inputs:                            | 8     | 15    |
| Logic:                             | 0     | 0     |
| Outputs:                           |       |       |
| Vcco33                             | 0     | 0     |
| Signals:                           | 0     | 0     |
| Quiescent Vccint 1.80V:            | 15    | 27    |
| Quiescent Vcco33 3.30V:            | 2     | 7     |

Timing Summary:


Speed Grade: -6

Minimum period: 6.420ns (Maximum Frequency: 155.763MHz) Minimum input arrival time before clock: 4.039ns Maximum output required time after clock: 6.514ns Maximum combinational path delay: No path found

Timing Detail:

All values displayed in nanoseconds (ns)

| Timing constraint:                         | Default period analysis for Clock 'clk' |
|--------------------------------------------|-----------------------------------------|
| Clock period:                              | 6.420ns (frequency: 155.763MHz)         |
| Total number of paths / destination ports: | [illegible] / 55                        |
| Delay:                                     | 6.420ns (Levels of Logic = 3)           |
| Source:                                    | X0/x2/add 3 (FF)                        |
| Destination:                               | X2/cache hit (FF)                       |
| Source Clock:                              | clk rising                              |
| Destination Clock:                         | clk rising                              |

(c)

Figure 10: Modified 16-b SQRT CSLA (a) area, (b) power, (c) time

Figure 10 shows the area, power and delay of modified 16-b SQRT CSLA simulated using VHDL in Xilinx ISE Design Suite 8.1.

The proposed modified SQRT CSLA saves 113 gate-area units compared with the regular SQRT CSLA, at the cost of only 11 additional gate delays.
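Since the evaluation also considers products of the metrics, the tool-reported figures can be combined into a power-delay product. The short sketch below uses only the total estimated power and clock period reported in Figures 6, 8 and 10; treating the synthesis clock period as the delay term is an assumption, and the names `REPORTED` and `power_delay_product` are hypothetical.

```python
# Total estimated power (mW) and clock period (ns) as reported in Figures 6, 8 and 10.
REPORTED = {
    "conventional CSLA": (173, 7.435),
    "regular SQRT CSLA": (147, 7.505),
    "modified SQRT CSLA": (133, 6.420),
}


def power_delay_product(power_mw: float, period_ns: float) -> float:
    """Power-delay product in picojoules (1 mW x 1 ns = 1 pJ)."""
    return power_mw * period_ns


for name, (power_mw, period_ns) in REPORTED.items():
    print(f"{name}: {power_delay_product(power_mw, period_ns):.1f} pJ")
```

These values are synthesis estimates, so the comparison is indicative rather than a post-layout result.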

### 7. Inferences

#### KS-ADDER:

**Timing Summary:** 

Speed Grade: -7

Minimum period: 3.589ns (Maximum Frequency: 278.660MHz)

Minimum input arrival time before clock: 35.960ns Maximum output required time after clock: 34.270ns Maximum combinational path delay: 42.892ns

| Power summary:                     | I(mA) | P(mW) |
|------------------------------------|-------|-------|
| Total estimated power consumption: |       | 674   |
| Vccint 1.80V:                      | 371   | 667   |
| Vcco33 3.30V:                      | 2     | 7     |
| Inputs:                            | 7     | 13    |
| Logic:                             | 285   | 513   |
| Outputs:                           |       |       |
| Vcco33                             | 0     | 0     |
| Signals:                           | 64    | 114   |
| Quiescent Vccint 1.80V:            | 15    | 27    |
| Quiescent Vcco33 3.30V:            | 2     | 7     |

Figure 11: KS adder

#### CSA ADDER:

Timing Summary:

Speed Grade: -7

Minimum period: No path found Minimum input arrival time before clock: 15.675ns Maximum output required time after clock: 11.254ns Maximum combinational path delay: 17.672ns

| Power summary:                    | I(mA) | P(mW) |
|-----------------------------------|-------|-------|
| Total estimated power consumption: |      | 121   |
| Vccint 1.80V:                     | 64    | 114   |
| Vcco33 3.30V:                     | 2     | 7     |
| Clocks:                           | 18    | 32    |
| Inputs:                           | 7     | 13    |
| Logic:                            | 16    | 28    |
| Outputs:                          |       |       |
| Vcco33                            | 0     | 0     |
| Signals:                          | 8     | 15    |
| Quiescent Vccint 1.80V:           | 15    | 27    |
| Quiescent Vcco33 3.30V:           | 2     | 7     |

Figure 12: CSA adder

### 8. Conclusion

This paper has proposed the AFPL. In the AFPL circuit, the logic blocks become active only when performing useful computations, and the idle logic blocks are not powered and have negligible leakage power dissipation. The AFPL circuit employs ECRL gates to construct its logic blocks to avoid short-circuit current from VDD to ground, and to eliminate the requirement for additional standalone pipeline latches.

The PCR mechanism can be incorporated in the AFPL circuit to form the AFPL-PCR circuit. The AFPL-PCR pipeline uses the enhanced C\*-element in its handshake controllers (HC) such that an ECRL logic gate in the AFPL-PCR pipeline can enter the sleep mode early to reduce the leakage dissipation once the output has been received by the downstream pipeline stage. For the AFPL-PCR implementation of an eight-bit five-stage pipelined Kogge–Stone adder, power consumption is less than that without using PCR.

A method is proposed in this paper to reduce the area and power of the SQRT CSLA architecture. The reduced number of gates offers a great advantage in the reduction of area and total power. The compared results show that the modified SQRT CSLA has a slightly larger delay, but the area and power of the 16-b modified SQRT CSLA are significantly reduced. The modified CSLA architecture is therefore a low-area, low-power, simple and efficient choice for VLSI hardware implementation. It also has lower power consumption than the Kogge–Stone adder.


# **Author Profile**



**Nadisha E B** received the B.Tech degree in Electronics and Communication Engineering from Mahatma Gandhi University, Kerala, at SCMS School of Engineering and Technology in 2013. She is now pursuing her M.Tech degree in VLSI and Embedded Systems under the same university at SCMS School of Engineering and Technology, Cochin.