Design of High Speed 32 Bit Multiplier Architecture Using Vedic Mathematics and Compressors

Deepak Kurmi¹, V. B. Baru²

¹PG Student, E&TC Department, Sinhgad College of Engineering, Pune, Maharashtra, India
²Associate Professor, E&TC Department, Sinhgad College of Engineering, Pune, Maharashtra, India

Abstract: The multiplier unit is a key block of digital signal processors as well as general purpose processors, and it substantially determines processor speed, so the design of high speed multipliers is a pressing need. This paper introduces a high speed multiplier architecture based on the Urdhwa Tiryakbhyam sutra of Vedic mathematics. Because the speed of a multiplier depends strongly on the addition of partial products, novel 4:2 and 7:2 compressors are used to increase the speed further; these compressors add quickly and require a lower gate count. Vedic mathematics, compressors and a reconfigurable multiplication architecture are combined to implement a high speed 32 bit multiplier. The delay of the proposed 32 bit multiplier is 44.249 ns. Upon comparison, the proposed multiplier is 1.5 times faster than the existing Vedic multiplier and almost 2 times faster than the conventional and Booth multipliers. The architecture has been implemented in Verilog and simulated using Xilinx ISE 14.5.

Keywords: VLSI, FPGA, Compressors, Vedic Mathematics.

1. Introduction

Enhancing the speed of multiplication is indispensable for current high performance digital signal processors and general purpose processors. High speed multiplication is becoming one of the key operations in signal processing and RISC processors. Several multiplier architectures have been introduced over the past few decades, such as Booth's multiplier [7] and the conventional multiplier, and these multipliers are very popular in modern VLSI design. These algorithms involve time consuming operations such as addition, shifting and subtraction, which require a large number of steps before arriving at the final answer. These steps also reduce the speed exponentially as the number of bits in the multiplicand and multiplier grows. This demands a very efficient multiplier architecture.

A novel multiplier architecture based on Vedic mathematics has been explored to address the disadvantages of existing multiplier architectures. Vedic mathematics is an ancient system of mathematics that was reintroduced by Bharati Krishna Tirthaji Maharaj [3], a scholar of Sanskrit, mathematics and philosophy. He condensed various complex mathematical problems into 16 sutras and 13 sub-sutras, which deal with trigonometry, algebra and geometry and have applications in signal processing, control engineering and VLSI. A primary advantage of the Urdhwa Tiryakbhyam sutra is that all the partial products are obtained simultaneously, which increases the speed of multiplication. As the addition of partial products consumes most of the time in multiplication, 4:2 and 7:2 compressors [1] have been introduced in this paper. A compressor is a compact architecture for adding more than 3 bits simultaneously. The novel compressor architecture introduced in this paper requires fewer gates than a full adder based compressor. First, 8 bit and 16 bit multipliers have been implemented using the Urdhwa Tiryakbhyam sutra [7] and compressors. The reconfigurable multiplier technique has then been used to design a still faster multiplier architecture.

2. Vedic Mathematics

Vedic mathematics is part of the Sthapatya Veda, which is an Upaveda of the Atharwa Veda [14]. Bharati Krishna Tirthaji Maharaj (1884-1960) thoroughly studied the Vedas and introduced Vedic mathematics in 16 sutras and 13 sub-sutras, which cover every area of mathematics. To be precise, Vedic mathematics is a collection of very simplified methods for solving various complex mathematical problems. Vedic mathematics is not magic but a logical way of looking at mathematics, and all the sutras in Vedic mathematics are purely logical. The sutras are listed below:

1. Ekadhikena Purvena
2. Nikhilam navatascaramam Dasatah
3. Urdhva - tiryagbhyam
4. Paravartya Yojayet
5. Sunyam Samya Samuccaye
6. Anurupye - Sunyamanyat
7. Sankalana - Vyavakalanabhyam
8. Puranapuranabhyam
9. Calana - Kalanabhyam
10. Ekanyunena Purvena
11. Anurupyena
12. Adyamadyenantya - mantyena
13. Yavadunam Tavadunikrtya Varganca Yojayet
14. Antyayor Dasakepi
15. Antyayoreva

2.1 Urdhwa Tiryakbhyam Sutra

The Urdhwa Tiryakbhyam Sutra is one of the sutras of Vedic mathematics and provides a very simple way to multiply two decimal numbers. The same technique is used here to multiply two binary numbers, so that the algorithm can be implemented in a digital system [14]; it works just as efficiently for binary operands. Urdhwa Tiryakbhyam is a Sanskrit phrase meaning "vertically and crosswise". Because all the partial products are calculated in parallel and require only AND gates, a multiplier based on this method is independent of the processor clock frequency. Figure 1 illustrates the multiplication of two 8 bit binary numbers using the Urdhwa Tiryakbhyam Sutra.

![Figure 1: Pictorial representation of Urdhwa Tiryakbhyam sutra for 2 8-bit binary multiplication [1]](image)

Figure 1 shows the algorithm for multiplying two 8 bit binary numbers using the Urdhwa Tiryakbhyam Sutra. Each arrow in Fig. 1 represents one partial product, so 64 partial products are required to multiply two 8 bit binary numbers. In Fig. 1 the leftmost point is the LSB and the rightmost point is the MSB.
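
The vertical and crosswise procedure can be sketched in software as follows. This is a minimal behavioural model in Python, not the Verilog implementation used in this work: the function name `urdhva_multiply` and its parameter `n` are illustrative assumptions, and the loop visits the columns one after another, whereas the hardware forms all AND terms in parallel.

```python
def urdhva_multiply(x: int, y: int, n: int = 8) -> int:
    """Multiply two n-bit unsigned integers column by column.

    Behavioural sketch only: names and structure are assumptions made
    for illustration, not taken from the paper's Verilog code.
    """
    xb = [(x >> i) & 1 for i in range(n)]  # xb[0] is the LSB of x
    yb = [(y >> i) & 1 for i in range(n)]  # yb[0] is the LSB of y
    product = 0
    carry = 0
    for k in range(2 * n - 1):
        # Crosswise step: AND every bit pair whose indices sum to k.
        column = sum(
            xb[i] & yb[k - i]
            for i in range(max(0, k - n + 1), min(k, n - 1) + 1)
        )
        total = column + carry
        product |= (total & 1) << k  # result bit of column k
        carry = total >> 1           # carried into column k + 1
    product |= carry << (2 * n - 1)  # remaining carry is the top bit
    return product
```

For two 8 bit operands the inner sums touch each of the 64 AND terms exactly once, and `urdhva_multiply(179, 109)` should agree with the ordinary product `179 * 109`.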

3. Compressor Architecture

A compressor is a digital architecture for adding more than 3 bits simultaneously. Various compressor architectures are available, such as 3:2, 4:2, 5:2 and 7:2; the name reflects their ability to reduce a larger number of bits to a smaller number. In this paper a novel approach to building 4:2 and 7:2 compressors has been used and compared with full adder based compressors.

3.1 4:2 Compressor

A 4:2 compressor has 5 inputs (4 input bits and one carry input) and produces three output bits [2]. A general block diagram of the 4:2 compressor is shown in Fig. 2. Various approaches have been proposed to improve the speed of the compressor.

![Figure 2: Block representation of 4:2 compressor](image)
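
The input and output behaviour of a 4:2 compressor can be illustrated with the textbook construction from two cascaded full adders. The sketch below is a Python behavioural model; the names `full_adder`, `compressor_4_2`, `total`, `carry` and `cout` are assumptions chosen for illustration and are not taken from the figures.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum, carry) of three input bits."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_4_2(x1: int, x2: int, x3: int, x4: int, cin: int):
    """Full adder based 4:2 compressor (behavioural sketch).

    Returns (total, carry, cout), where total has weight 1 and
    carry and cout both have weight 2.
    """
    s1, cout = full_adder(x1, x2, x3)       # cout does not depend on cin
    total, carry = full_adder(s1, x4, cin)  # second stage absorbs x4 and cin
    return total, carry, cout
```

For every input combination the outputs satisfy `x1 + x2 + x3 + x4 + cin == total + 2 * (carry + cout)`, which is the property that lets a column of partial product bits be reduced without a carry ripple through the first stage.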

An optimized 4:2 compressor architecture is proposed in this paper to reduce the critical path of the compressor. The full adder based 4:2 compressor is shown in Fig. 3 and the optimized logic diagram of the 4:2 compressor is shown in Fig. 4. Comparing the two, it can be seen that the propagation delay of the proposed compressor is lower.

![Figure 3: Full adder based 4:2 compressor](image)

![Figure 4: Optimized 4:2 compressor architecture [2]](image)

The full adder based compressor uses a large number of XOR gates, and each XOR gate needs 7 transistors, which increases the power consumption of the design. In the proposed compressor, by contrast, the full adder is implemented using MUX and XOR blocks, and complementary pass transistor logic (CPL) further reduces the delay and power consumption.
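
A commonly used XOR and MUX formulation of the 4:2 compressor is sketched below in Python. It is given as an assumption about the general style of such designs and may differ in detail from the gate level circuit of Fig. 4; the helper `mux` and all signal names are hypothetical.

```python
def mux(sel: int, a: int, b: int) -> int:
    """2:1 multiplexer: return a when sel is 1, otherwise b."""
    return a if sel else b


def compressor_4_2_mux(x1: int, x2: int, x3: int, x4: int, cin: int):
    """XOR/MUX style 4:2 compressor (behavioural sketch, assumed wiring).

    Returns (total, carry, cout) with weights 1, 2 and 2.
    """
    p = x1 ^ x2
    q = x3 ^ x4
    r = p ^ q                 # x1 ^ x2 ^ x3 ^ x4
    total = r ^ cin
    cout = mux(p, x3, x1)     # majority of x1, x2, x3 without AND/OR gates
    carry = mux(r, cin, x4)   # majority of (x1 ^ x2 ^ x3), x4, cin
    return total, carry, cout
```

The two multiplexers replace the AND and OR carry logic of the full adders, so the longest path consists of three XOR stages while the arithmetic relation `total + 2 * (carry + cout)` equal to the number of ones among the inputs is preserved.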

3.2 7:2 Compressor

Similar to the 4:2 compressor, the 7:2 compressor is able to add 9 bits (7 input bits and 2 carry bits from the previous stage) simultaneously. The 7:2 compressor has been implemented using two 4:2 compressors, two full adders and one half adder, and its architecture is shown in Fig. 5 [1].
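
The arithmetic behaviour expected of a 7:2 compressor can be checked with a simple reference model. The Python sketch below is built from full adders only, so it is not the wiring of Fig. 5; the output names and their weights (`total` of weight 1, `carry` and `cout1` of weight 2, `cout2` of weight 4) are assumptions that follow the usual definition of a 7:2 compressor.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum, carry) of three input bits."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def compressor_7_2(x, cin1: int, cin2: int):
    """Reference 7:2 compressor from five full adders (assumed structure).

    x is a sequence of 7 input bits. Returns (total, carry, cout1, cout2)
    with weights 1, 2, 2 and 4 respectively.
    """
    x1, x2, x3, x4, x5, x6, x7 = x
    s1, c1 = full_adder(x1, x2, x3)
    s2, c2 = full_adder(x4, x5, x6)
    s3, c3 = full_adder(s1, s2, x7)
    total, carry = full_adder(s3, cin1, cin2)  # only stage that sees cin1, cin2
    cout1, cout2 = full_adder(c1, c2, c3)      # combine the weight 2 carries
    return total, carry, cout1, cout2
```

In this model `cout1` and `cout2` do not depend on the carry inputs, and for all 512 input combinations `total + 2 * (carry + cout1) + 4 * cout2` equals the number of ones among the nine inputs.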

Figure 5: 7:2 compressor using 4:2 compressors, half adder and full adders [1]

The 7:2 compressor has also been coded in Verilog and tested in Xilinx ISE Design Suite 14.5. The proposed 7:2 compressor is found to be 1.05 times faster than the conventional 7:2 compressor.

4. Proposed Methodology

In this paper an 8 bit high speed multiplier architecture has been implemented using Vedic mathematics (Urdhwa Tiryakbhyam Sutra). The proposed 4:2 and 7:2 compressors are used for the addition of partial products, and this approach significantly increases the speed of the Vedic multiplier. The significance of the 4:2 and 7:2 compressors has already been shown in the previous section.

Further, a high speed 32 bit multiplier is designed using Urdhwa Tiryakbhyam, compressors and the reconfigurable multiplication method.

4.1 8 Bit and 16 Bit Multiplier Architecture

The 8 bit multiplier architecture is implemented using the Urdhwa Tiryakbhyam Sutra. Consider an 8 bit multiplicand X and an 8 bit multiplier Y that are to be multiplied; this can be achieved by the Urdhwa Tiryakbhyam method discussed in Section 2.1. The following equations show the procedure for generating the partial products. All the partial products are then added using the proposed compressors, full adders and half adders.

\[
\begin{aligned}
P_0 &= X_0Y_0 && (1) \\
P_1 &= X_0Y_1 + X_1Y_0 + C_1 && (2) \\
P_2 &= X_2Y_0 + X_0Y_2 + X_1Y_1 + C_2 && (3)
\end{aligned}
\]

Equations (1) to (16) denote all the vertical and crosswise partial products and their addition according to Urdhwa Tiryakbhyam; C1 to C38 are carries. P15 to P0 denote the 16 bit output, where P0 is the LSB and P15 is the MSB.
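
The pattern of these equations can be summarised in a single general form. The expression below is a sketch under one simplifying assumption: all carry bits entering column k are lumped into a single integer C_k with C_0 = 0, whereas the design itself tracks them as individual carry bits. The intermediate symbol S_k is introduced here only for illustration.

\[
S_k = \sum_{\substack{i + j = k \\ 0 \le i,\, j \le 7}} X_i Y_j + C_k, \qquad
P_k = S_k \bmod 2, \qquad
C_{k+1} = \left\lfloor \frac{S_k}{2} \right\rfloor, \qquad k = 0, 1, \dots, 15 .
\]

For k = 15 the sum over i + j = k is empty, so the most significant bit P_15 is determined by the incoming carry alone.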

A 16 bit high speed multiplier has also been implemented using the Urdhwa Tiryakbhyam Sutra and the equations mentioned above. Compressors and a carry look-ahead adder are used to add all the partial products. A significant improvement has been observed in the 16 bit multiplier compared with the conventional multiplier. Both the 8 bit and 16 bit multipliers have been coded in Verilog and synthesized in Xilinx ISE 14.5.

4.2 32-Bit High Speed Multiplier Architecture

For 32 bit multiplication a different approach is taken: a reconfigurable multiplier [13] is combined with Urdhwa Tiryakbhyam and compressors to obtain a very efficient architecture.

To implement the 32 bit multiplier, let A and B be the 2n-bit wide multiplicand and multiplier, respectively. A_H and B_H are their respective n most significant bits, whereas A_L and B_L are their respective n least significant bits. A_H * B_L and A_L * B_H are the crosswise products. With n = 16, the product of A and B can be expressed as follows:

\[
\text{PRODUCT (32 BIT)} = (A_H * B_H)\, 2^{32} + (A_H * B_L + A_L * B_H)\, 2^{16} + A_L * B_L \quad [13]
\]
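
The decomposition of a wide multiplication into products of half-width operands can be sketched as follows. This Python model only checks the arithmetic of the scheme: each operand is split into its n most significant and n least significant bits, four half-width products are formed, and they are recombined by shifting. The function names are hypothetical, and `multiply_8` uses Python's built-in multiplication as a stand-in for the 8 bit Vedic block.

```python
def split(value: int, n: int) -> tuple[int, int]:
    """Split a 2n-bit value into (high n bits, low n bits)."""
    return value >> n, value & ((1 << n) - 1)


def multiply_2n(a: int, b: int, n: int, base_multiply) -> int:
    """Multiply two 2n-bit operands using four n-bit multiplications."""
    a_h, a_l = split(a, n)
    b_h, b_l = split(b, n)
    high = base_multiply(a_h, b_h)
    cross = base_multiply(a_h, b_l) + base_multiply(a_l, b_h)  # crosswise terms
    low = base_multiply(a_l, b_l)
    return (high << (2 * n)) + (cross << n) + low


def multiply_8(a: int, b: int) -> int:
    """Stand-in for the 8 bit Vedic multiplier block (assumption)."""
    return a * b


def multiply_16(a: int, b: int) -> int:
    return multiply_2n(a, b, 8, multiply_8)


def multiply_32(a: int, b: int) -> int:
    return multiply_2n(a, b, 16, multiply_16)
```

With this structure one 32 bit multiplication uses four 16 bit multiplications, each of which in turn uses four 8 bit multiplications; in hardware these blocks can operate in parallel, leaving only the final shifted additions on the critical path.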

Fig. 6 describes the methodology used to implement the 32 bit multiplier, which is very efficient in terms of speed. All the 16 bit and 8 bit multiplications have been implemented using the Urdhwa Tiryakbhyam Sutra of Vedic mathematics. The final product of the 32 bit multiplier is calculated using the reconfigurable multiplier formula given above.

The 32 bit multiplier is coded in Verilog HDL (Hardware Description Language). Logic synthesis and simulation were done using the EDA (Electronic Design Automation) tools of Xilinx ISE 14.5, namely Project Navigator and the ISim simulator integrated in the Xilinx package. The Spartan-6 device XC6SLX16 has been used for synthesis.

Various popular multipliers, such as the Vedic multiplier, Booth's multiplier and the conventional multiplier, have also been implemented in Verilog HDL and compared with the proposed 32 bit multiplier in terms of logic delay and route delay. The proposed multiplier shows a significant improvement in performance. All the results are tabulated in Table 1.

<table>
<thead>
<tr>
<th>Architecture</th>
<th>Proposed 32 bit multiplier</th>
<th>Vedic Multiplier</th>
<th>Booth's Multiplier</th>
</tr>
</thead>
<tbody>
<tr>
<td>Total Delay(ns)</td>
<td>44.249</td>
<td>63.120</td>
<td>128.068</td>
</tr>
<tr>
<td>Logic Delay(ns)</td>
<td>11.121</td>
<td>14.998</td>
<td>42.342</td>
</tr>
<tr>
<td>Route Delay(ns)</td>
<td>33.128</td>
<td>48.122</td>
<td>85.726</td>
</tr>
<tr>
<td>Logic levels</td>
<td>38</td>
<td>57</td>
<td>75</td>
</tr>
</tbody>
</table>

From Table 1 it is evident that the proposed multiplier reduces both the total delay and the number of logic levels. Upon comparison with the other multipliers, the proposed multiplier is almost 2 times faster than Booth's multiplier and 1.5 times faster than the conventional Vedic multiplier.
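
The delay ratios can be recomputed directly from the entries of Table 1. The Python sketch below uses only the three architectures listed in the table, so the ratios it yields need not coincide with the rounded factors quoted in the text; the dictionary layout and the function name `speedup` are illustrative assumptions.

```python
# Delay values in ns, copied from Table 1.
delays_ns = {
    "Proposed 32 bit multiplier": {"logic": 11.121, "route": 33.128, "total": 44.249},
    "Vedic Multiplier": {"logic": 14.998, "route": 48.122, "total": 63.120},
    "Booth's Multiplier": {"logic": 42.342, "route": 85.726, "total": 128.068},
}


def speedup(baseline: str, proposed: str = "Proposed 32 bit multiplier") -> float:
    """Ratio of the baseline total delay to the proposed total delay."""
    return delays_ns[baseline]["total"] / delays_ns[proposed]["total"]


def delays_consistent(name: str, tol: float = 1e-6) -> bool:
    """Check that logic delay plus route delay equals the total delay."""
    d = delays_ns[name]
    return abs(d["logic"] + d["route"] - d["total"]) < tol


if __name__ == "__main__":
    for name in ("Vedic Multiplier", "Booth's Multiplier"):
        print(f"{name}: {speedup(name):.2f}x slower than the proposed design")
```

The helper `delays_consistent` confirms that, for each column of the table, the logic delay and route delay add up to the reported total delay.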

6. Conclusion

This paper presents a novel way of realizing a high speed 32 bit multiplier by combining the Urdhwa Tiryakbhyam sutra, compressors and the reconfigurable multiplication method; the methodology used to implement it has also been discussed. Since speed is the major concern, the proposed multiplier offers the lowest delay among the multipliers compared. As future work, the performance of the multiplier can be tested within an ALU and compared with other existing multipliers.

References


