# High Speed Vedic Multiplier Design Based On CSLA

# Sijo Mathew<sup>1</sup>, S. Chinnapparaj<sup>2</sup>, Dr. D. Somasundareswari<sup>3</sup>

<sup>1</sup>PG Scholar, Hindusthan Institute of Technology, Coimbatore-32, India

<sup>2</sup>Assistant Professor, Department of ECE, Hindusthan Institute of Technology, Coimbatore-32, India

<sup>3</sup>Professor and Dean, Department of ECE, SNS College of Technology, Coimbatore-35, India

Abstract: This work proposes a high-speed Vedic multiplier based on an area-, delay-, and power-efficient carry select adder (CSLA). A fast method of multiplication based on ancient Indian Vedic mathematics is used. The whole of Vedic mathematics is based on 16 sutras and manifests a unified structure of mathematics. Among the various methods of multiplication in Vedic mathematics, Urdhva Tiryakbhyam is discussed in detail. All the redundant logic operations present in the conventional CSLA are eliminated, and a new logic formulation for the CSLA is proposed. The proposed CSLA design involves significantly less area and delay than the recently proposed BEC-based CSLA. The multiplier discussed here is compared with other multipliers to highlight the speed and power superiority of the Vedic multiplier.

Keywords: Carry Select Adder, Urdhava Tiryakbhyam Sutra, Multiplier, Vedic Mathematics, Low power design.

#### 1. Introduction

The multiplier is one of the fundamental hardware blocks in many digital signal processing (DSP) systems. Important arithmetic functions implemented with the multiplier in DSPs include multiply and accumulate (MAC) and the inner product. Beyond DSP systems, the digital multiplier is an indispensable block in digital image processing systems and in the ALU of microprocessors. Earlier microprocessors did not have a multiplier block; instead they used multiply routines that shifted and added partial results to produce the final product. With the steadily increasing levels of integration in VLSI circuits, the design of multiplier blocks has begun to receive considerable attention in digital system design.

In this work, the Urdhva Tiryakbhyam Sutra is first applied to the binary number system and used to develop a digital multiplier architecture. This is shown to be very similar to the popular array multiplier architecture. The Sutra is also effective in reducing an N×N multiplier structure into efficient 4×4 multiplier structures. The proposed multiplication algorithm is then illustrated to show its computational efficiency by taking an example of reducing a 4×4-bit multiplication to a single 2×2-bit multiplication operation. This work presents a systematic design methodology for a fast and area-efficient digit multiplier based on Vedic mathematics. The multiplier architecture is based on the Vertical and Crosswise algorithm of ancient Indian Vedic mathematics.

A conventional carry select adder (CSLA) is an RCA–RCA configuration that generates a pair of sum words and output-carry bits corresponding to the anticipated input-carry (cin = 0 and 1) and selects one out of each pair for the final sum and output-carry. A conventional CSLA has a shorter critical path delay (CPD) than a ripple carry adder (RCA), but the design is not attractive since it uses a dual RCA. Few attempts have been made to avoid the dual use of RCA in CSLA design.

## 2. Existing System

#### 2.1 Urdhva Tiryakbhyam Sutra

The Urdhva Tiryakbhyam (Vertical and Crosswise) algorithm can be generalized to n × n bit numbers. This multiplier has the advantage that, as the number of bits increases, gate delay and area increase very slowly compared to other multipliers. It is therefore time, space, and power efficient, and the architecture has been demonstrated to be quite efficient in terms of silicon area and speed. Since the partial products and their sums are calculated in parallel, the multiplier is independent of the clock frequency of the processor: it requires the same amount of time to calculate the product regardless of the clock frequency. The net advantage is that it reduces the need for microprocessors to operate at increasingly high clock frequencies. While a higher clock frequency generally results in increased processing power, its disadvantage is that it also increases power dissipation, which results in higher device operating temperatures. By adopting the Vedic multiplier, microprocessor designers can easily circumvent these problems to avoid catastrophic device failures. The processing power of the multiplier can easily be increased by increasing the input and output data bus widths, since it has quite a regular structure. Due to its regular structure, it can also be laid out easily.

Methodology of Parallel Calculation

Multiplication of 1 1 1 1 by 1 1 1 1, one vertical and crosswise step per result digit (right to left):

```
Step 1: 1 × 1 = 1
Step 2: 1 × 1 + 1 × 1 = 2
Step 3: 1 × 1 + 1 × 1 + 1 × 1 = 3
Step 4: 1 × 1 + 1 × 1 + 1 × 1 + 1 × 1 = 4
Step 5: 1 × 1 + 1 × 1 + 1 × 1 = 3
Step 6: 1 × 1 + 1 × 1 = 2
Step 7: 1 × 1 = 1
Final answer = 1 2 3 4 3 2 1
```
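The vertical and crosswise steps can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `urdhva_multiply`, the most-significant-digit-first list representation, and the `base` parameter are assumptions introduced here. Each column sum collects the crosswise digit products that share the same weight, and carries are propagated afterwards.

```python
def urdhva_multiply(x_digits, y_digits, base=10):
    """Multiply two equal-length digit lists (most significant digit first).

    Returns (column_sums, result_digits), both most significant first.
    Hypothetical helper written for illustration only.
    """
    n = len(x_digits)
    if len(y_digits) != n:
        raise ValueError("operands must have the same number of digits")

    # Work with index 0 as the least significant digit.
    x = x_digits[::-1]
    y = y_digits[::-1]

    # Vertical and crosswise step: column k sums all x[i] * y[k - i].
    column_sums = []
    for k in range(2 * n - 1):
        total = sum(x[i] * y[k - i] for i in range(n) if 0 <= k - i < n)
        column_sums.append(total)

    # Carry propagation from the least significant column upwards.
    result = []
    carry = 0
    for total in column_sums:
        total += carry
        result.append(total % base)
        carry = total // base
    while carry:
        result.append(carry % base)
        carry //= base

    return column_sums[::-1], result[::-1]
```

For the operands used in the example above, `urdhva_multiply([1, 1, 1, 1], [1, 1, 1, 1])` yields the column sums 1 2 3 4 3 2 1, and because no column exceeds 9 the result digits are identical. All column sums are independent of one another, which is the property the parallel hardware exploits.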

#### 2.2 Vedic Multiplier

The 8x8 Vedic multiplier module is implemented using four 4x4 Vedic multiplier modules, as shown in Fig. 1. Here partial product generation and addition are done concurrently. a[7:0] and b[7:0] are taken as the two binary inputs. Four 4x4 Vedic multiplier modules and three 8-bit ripple carry adders are used to generate the desired 16-bit product s15 down to s0. The least significant 4 bits of the result of the rightmost 4x4 Vedic multiplier module produce the result s3s2s1s0. The 8-bit ripple carry adder located in the middle in Fig. 4 adds two 8-bit operands, i.e. the concatenated 8 bits ("0000" and the most significant four bits of the rightmost 4x4 Vedic multiplier module) and the result of the second-from-right 4x4 Vedic multiplier module. It generates the resultant bits s7s6s5s4 at its output. The upper 8-bit ripple carry adder adds the results of two 4x4 Vedic multiplier modules (second and third from right) and generates one carry and an 8-bit result. The bottom 8-bit ripple carry adder adds a 4x4 Vedic multiplier module result and the concatenated 8 bits ("000", the carry from the upper 8-bit ripple carry adder, and the most significant bits of the result from the middle 8-bit ripple carry adder) to generate the most significant bits of the final product, i.e. s15s14s13s12s11s10s9s8.
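The decomposition of an 8x8 multiplication into four 4x4 products and three additions can be checked with a short behavioural model. The sketch below is an assumption-laden illustration rather than the wiring of Fig. 1: the names `vedic_4x4`, `vedic_8x8`, `q0` to `q3`, `cross`, `mid`, and `high` are hypothetical, and the three additions are modelled with plain integer arithmetic, so any carry out of an adder is simply kept in the intermediate value.

```python
def vedic_4x4(a, b):
    """4x4-bit product from Urdhva Tiryakbhyam column sums (behavioural model)."""
    if not (0 <= a < 16 and 0 <= b < 16):
        raise ValueError("operands must be 4-bit values")
    product = 0
    for k in range(7):  # columns of weight 2**0 .. 2**6
        column = sum(
            ((a >> i) & 1) & ((b >> (k - i)) & 1)
            for i in range(4)
            if 0 <= k - i < 4
        )
        product += column << k
    return product


def vedic_8x8(a, b):
    """8x8-bit product built from four 4x4 products and three additions."""
    if not (0 <= a < 256 and 0 <= b < 256):
        raise ValueError("operands must be 8-bit values")
    a_lo, a_hi = a & 0xF, a >> 4
    b_lo, b_hi = b & 0xF, b >> 4

    q0 = vedic_4x4(a_lo, b_lo)  # rightmost module, weight 2**0
    q1 = vedic_4x4(a_hi, b_lo)  # cross term, weight 2**4
    q2 = vedic_4x4(a_lo, b_hi)  # cross term, weight 2**4
    q3 = vedic_4x4(a_hi, b_hi)  # leftmost module, weight 2**8

    cross = q1 + q2             # first addition: the two cross terms
    mid = cross + (q0 >> 4)     # second addition: add upper nibble of q0
    high = q3 + (mid >> 4)      # third addition: bits above s7..s4 go to the top byte

    # s3..s0 come from q0, s7..s4 from mid, s15..s8 from high.
    return (high << 8) | ((mid & 0xF) << 4) | (q0 & 0xF)
```

Since `mid` is at most 225 + 225 + 14 = 464, the value passed up to the third addition is at most 29, and `high` never exceeds 254, so the top byte always fits in 8 bits.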



Figure 1: Block Diagram of 8\*8 bit Vedic Multiplier.

#### 2.3 Carry Select Adder

The CSLA has two units: 1) the sum and carry generator (SCG) unit and 2) the sum and carry selection unit, as shown in Fig. 2. The SCG unit consumes most of the logic resources of the CSLA and contributes significantly to the critical path. Different logic designs have been suggested for efficient implementation of the SCG unit. We studied the logic designs suggested for the SCG unit of the conventional and BEC-based CSLAs by means of suitable logic expressions. The main objective of this study is to identify redundant logic operations and data dependence. Accordingly, we remove all redundant logic operations and sequence the logic operations based on their data dependence.
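For reference, the conventional RCA–RCA carry select adder that serves as the baseline can be modelled behaviourally as follows. This is a minimal sketch under stated assumptions: bit lists are least significant bit first, and the helper names `to_bits`, `from_bits`, `ripple_carry_add`, and `conventional_csla` are hypothetical. The two ripple carry passes stand in for the SCG unit, and the final choice stands in for the sum and carry selection unit.

```python
def to_bits(value, n):
    """Return the n low-order bits of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, cin):
    """Bit-serial ripple carry addition; returns (sum_bits, cout)."""
    carry = cin
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def conventional_csla(a_bits, b_bits, cin):
    """Dual-RCA carry select adder: compute both cases, then select."""
    # Sum and carry generation for both anticipated input carries.
    sum0, cout0 = ripple_carry_add(a_bits, b_bits, 0)
    sum1, cout1 = ripple_carry_add(a_bits, b_bits, 1)
    # Sum and carry selection, controlled by the actual input carry.
    if cin:
        return sum1, cout1
    return sum0, cout0
```

The duplicated `ripple_carry_add` call makes the cost of the dual RCA explicit: every sum bit is computed twice, although only one copy is ever used.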



Figure 2: Structure of the BEC-based CSLA; n is the input operand bit-width

## 3. Proposed System

#### 3.1 Proposed Adder Design

The proposed CSLA is based on the logic formulation given in (4a)–(4g), and its structure is shown in Fig. 3(a). It consists of one HSG unit, one FSG unit, one CG unit, and one CS unit. The CG unit is composed of two CGs (CG0 and CG1) corresponding to input-carry '0' and '1'. The HSG receives two n-bit operands (A and B) and generates the half-sum word s0 and the half-carry word c0, each of width n bits. Both CG0 and CG1 receive s0 and c0 from the HSG unit and generate two n-bit full-carry words c10 and c11 corresponding to input-carry '0' and '1', respectively. The logic diagram of the HSG unit is shown in Fig. 3(b). The logic circuits of CG0 and CG1 are optimized to take advantage of the fixed input-carry bits. The optimized designs of CG0 and CG1 are shown in Fig. 3(c) and (d), respectively.

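$$S0(i) = A(i) \oplus B(i), \qquad C0(i) = A(i) \cdot B(i) \qquad \text{(4a)}$$

$$C10(i) = C10(i-1) \cdot S0(i) + C0(i) \quad \text{for } C10(0) = 0 \qquad \text{(4b)}$$

$$C11(i) = C11(i-1) \cdot S0(i) + C0(i) \quad \text{for } C11(0) = 1 \qquad \text{(4c)}$$

$$C(i) = \begin{cases} C10(i) & \text{if } Cin = 0 \\ C11(i) & \text{if } Cin = 1 \end{cases} \qquad \text{(4d)}$$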

Volume 4 Issue 6, June 2015 <u>www.ijsr.net</u> Licensed Under Creative Commons Attribution CC BY

#### International Journal of Science and Research (IJSR) ISSN (Online): 2319-7064 Index Copernicus Value (2013): 6.14 | Impact Factor (2013): 4.438

**Figure 3:** (a) Proposed CS adder design, where n is the input operand bit-width. (b) Gate-level design of the HSG. (c) Gate-level optimized design for input-carry = 0. (d) Gate-level optimized design for input-carry = 1. (e) Gate-level design of the CS unit. (f) Gate-level design of the final-sum generation (FSG) unit.

The CS unit selects one final carry word from the two carry words available at its input lines using the control signal cin. It selects c10 when cin = 0; otherwise, it selects c11. The CS unit can be implemented using an n-bit 2-to-1 MUX. However, the truth table of the CS unit allows a simpler realization: the optimized design of the CS unit, shown in Fig. 3(e), is composed of n AND–OR gates. The final carry word c is obtained from the CS unit. The MSB of c is sent to the output as cout, and the (n − 1) LSBs are XORed with the (n − 1) MSBs of the *half-sum* (s0) in the FSG [shown in Fig. 3(f)] to obtain the (n − 1) MSBs of the final sum (s). The LSB of s0 is XORed with cin to obtain the LSB of s.
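The data flow through the HSG, CG, CS, and FSG units can be modelled bit by bit. The following is a behavioural sketch only, not the gate-level design of Fig. 3: the function name `proposed_csla` and the LSB-first bit lists are assumptions, and the initial values 0 and 1 in (4b) and (4c) are interpreted here as the carry entering bit 0. The AND-OR form used for the CS unit, c(i) = c10(i) OR (cin AND c11(i)), relies on the observation that c10(i) = 1 implies c11(i) = 1, which the exhaustive check in the usage note below confirms for small widths.

```python
def proposed_csla(a_bits, b_bits, cin):
    """Behavioural model of the HSG -> CG0/CG1 -> CS -> FSG flow.

    a_bits and b_bits are equal-length bit lists, least significant bit first.
    Returns (sum_bits, cout). Hypothetical helper written for illustration.
    """
    n = len(a_bits)
    if len(b_bits) != n or n == 0:
        raise ValueError("operands must be non-empty and of equal width")

    # HSG: half-sum and half-carry words, as in (4a).
    s0 = [a ^ b for a, b in zip(a_bits, b_bits)]
    c0 = [a & b for a, b in zip(a_bits, b_bits)]

    # CG0 and CG1: full-carry words for input-carry 0 and 1, as in (4b), (4c).
    c10, c11 = [], []
    prev0, prev1 = 0, 1  # assumed carry into bit 0 for each case
    for i in range(n):
        prev0 = (prev0 & s0[i]) | c0[i]
        prev1 = (prev1 & s0[i]) | c0[i]
        c10.append(prev0)
        c11.append(prev1)

    # CS: carry selection before the final sum, using one AND-OR per bit.
    c = [c10[i] | (cin & c11[i]) for i in range(n)]
    cout = c[n - 1]

    # FSG: LSB uses cin; the other bits use the selected carry of the bit below.
    s = [s0[0] ^ cin] + [s0[i] ^ c[i - 1] for i in range(1, n)]
    return s, cout
```

A quick way to gain confidence in the sketch is to compare it against integer addition for every pair of 4-bit operands and both values of cin. Note that only one half-sum word is computed and the carry is selected before the final sum is formed, in contrast to a dual-RCA design that computes two complete sums.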



#### 3.2 Proposed Multiplier Design

Figure 4: 8\*8 Vedic Multiplier Using Proposed CSLA

The architecture of the 8x8 Vedic multiplier using the "Urdhva Tiryagbhyam" Sutra is shown in Fig. 4. The 8x8 Vedic multiplier architecture is implemented using four 4x4 Vedic multiplier modules and three 8-bit modified carry select adders of the kind shown in Fig. 3(a). The first step in the design of the $8\times8$ block is to group each 8-bit input into 4-bit halves. These pairs form the vertical and crosswise product terms. Each pair of input halves is handled by a separate $4\times4$ Vedic multiplier. Fig. 4 shows the schematic of an $8\times8$ block designed using $4\times4$ blocks. The partial products represent the Urdhva vertical and cross product terms.

#### 3.3 ASIC Synthesis Results

We have coded the Vedic multiplier in VHDL using the proposed CSLA design and the existing CSLA designs for bit-widths 8, 16, and 32. All the designs are synthesized in the Cadence Design Suite. The netlist files obtained from the DC are processed in the IC Compiler (ICC). After placement and routing, the area, delay, and power reported by the ICC are listed in Table 1 for comparison.

As shown in Table 1, the proposed SQRT-CSLA involves significantly less area and delay and consumes less power than the existing designs. Fig. 5 shows that, on average over the different bit-widths, the proposed multiplier design offers a saving of 39% in ADP and 37% in energy compared with the conventional Vedic multiplier, and the proposed CSLA saves 32% in ADP and 33% in energy compared with the BEC-based SQRT-CSLA.

#### 4. Performance Analysis

The proposed system was analysed for performance and overhead against existing multiplication techniques. The results of the performance analysis are visualized as graphs to give a clear view of the improvements achieved. The analysis was carried out on the following metrics: throughput, area, ADP, and speed. The resulting values are tabulated in Table 1.

Table 1: Parameter measurements

| Design             | Throughput | Area   | Delay (ns) | ADP   |
|--------------------|------------|--------|------------|-------|
| Conv CSLA          | 0.552      | 1438.1 | 3.45       | 0.692 |
| CSLA With BEC      | 27.2       | 1228.2 | 3.78       | 21.7  |
| CSLA Using D-Latch | 0.58       | 906    | 4.08       | 0.72  |
| Proposed CSLA      | 0.542      | 951    | 3.42       | 0.668 |

The tabulated values are plotted below for comparison of the existing systems and the proposed system.



Figure 4: Performance analysis of proposed system vs existing systems

The overhead reduction achieved by the different multipliers, based on the analysis results, is visualized using a pie chart.



Figure 5: Computational Overhead

The values and the graph show that this system is more efficient than any existing location calibration techniques on the grounds of fairness, adaptability, scalability, and minimal cost for communication and calibration, and that it has minimal computational overhead.

## 5. Conclusion

This work develops an area-, delay-, and power-efficient Vedic multiplier based on the carry select adder. It presents a novel way of realizing a high-speed multiplier using the Urdhva Tiryagbhyam sutra and the carry select addition technique. A 4-bit modified multiplier is designed, and the 8-bit multiplier is realized using four 4-bit Vedic multipliers and the proposed carry select adders. The carry select adders (CSLA) are modified so that all the redundant logic operations present in the conventional CSLA are eliminated, and a new logic formulation for the CSLA is proposed. In the proposed scheme, the carry select (CS) operation is scheduled before the calculation of the final sum, which differs from the conventional approach. The proposed 8-bit multiplier gives a total delay of 15.050 ns, which is less than the total delay of the other well-known multiplier architectures considered. Results also indicate a 13.65% increase in speed compared with the normal Vedic multiplier without the carry select adder technique.


#### **Author Profile**



**Mr Sijo Mathew** received his B.Tech degree in Electronics & Communication from College of Engineering Kidangoor, Kottayam, affiliated to Cochin University of Science and Technology (CUSAT). He is currently pursuing an M.E degree in VLSI Design at Hindusthan Institute of Technology, Coimbatore, affiliated to Anna University, Chennai, Tamilnadu. His research interests are low power VLSI design, FIR and IIR filter design, and network on chip design.



**Mr S. Chinnapparaj** received his B.E degree in Electronics & Communication Engineering from RVS College of Engineering & Technology, Dindigul, affiliated to Anna University, Chennai, Tamilnadu, and his M.E degree in VLSI Design from Anna University, Coimbatore, Tamilnadu. He is currently pursuing a Ph.D. at Anna University, Coimbatore, and working as an Assistant Professor in the Department of ECE at Hindusthan Institute of Technology, Coimbatore. His research interests are low power VLSI design, steganography, low power dissipation, and high fault coverage.



**Dr. D. Somasundareswari** is working as Professor and Dean in the Department of Electronics & Communication at SNS College of Technology, Coimbatore-35. She has more than 18 years of teaching experience and has published 22 papers in international journals. Her research interests are low power VLSI design, signal processing, and digital design.