# Design of Advanced Configurable Radix-4 Booth Multiplier for Low Power and High Speed Applications

Sareddy Swathi<sup>1</sup>, P. Sandhya Rani<sup>2</sup>

<sup>1</sup>ECE, Bharat Institute of Technology and Science for Women, JNTUH University, Hyderabad, India

<sup>2</sup>ECE, Bharat Institute of Technology and Science for Women, JNTUH University, Hyderabad, India

Abstract: Many multimedia and DSP applications are highly multiplication intensive, so the performance and power consumption of these systems are dominated by multipliers. A multiplier combines two input operands to generate many partial products for subsequent addition operations, which in CMOS circuit design causes considerable switching activity. Switching activity within the functional unit therefore accounts for the majority of the power consumption and also increases delay. The approach presented here dynamically detects the input range of the multiplier and disables the switching operation of the non-effective ranges. Minimizing switching activity in this way can effectively reduce power dissipation and increase the speed of operation without affecting the circuit's operational performance. An attempt is made to combine configuration, partially guarded computation, and the truncation technique to design a high-speed and power-efficient configurable Booth multiplier (CBM). The main concerns are speed, power efficiency, and structural flexibility. The proposed multiplier not only performs single 16-b, single 8-b, or twin parallel 8-b multiplication operations but also offers a flexible tradeoff between output accuracy and power consumption to achieve more power savings. Portable multimedia and digital signal processing (DSP) systems, which typically require flexible processing ability, low power consumption, and a short design cycle, have become increasingly popular over the past few years.

Keywords: Booth multiplier (BM), configurable Booth multiplier (CBM).

#### 1. Introduction

Several techniques [1]–[3] are available to improve the speed and power efficiency of multipliers. Approaches such as guarded evaluation, clock gating, signal gating, and truncation reduce the power consumption and increase the speed of multipliers by eliminating spurious computations according to the dynamic range of the input operands. The work in [4] separated the arithmetic units into most and least significant parts and, to save power, turned off the most significant part when it did not affect the computation results. The techniques in [5] dynamically adjust two voltage supplies based on the range of the incoming operands and disable ineffective ranges with zero-detection circuitry to decrease the power consumption of multipliers. In [6], a dynamic-range detector was developed to detect the effective range of the two operands. The operand with the smaller dynamic range is Booth encoded so that partial products have a greater opportunity to be zero, thereby reducing power consumption as far as possible. For fixed-width multiplication, significant power saving can be achieved by directly omitting the adder cells that compute the least significant bits of the output product, but large truncation errors are introduced. Various error compensation approaches and circuits have been proposed, which add estimated compensation carries to the carry inputs of the retained adder cells to reduce the truncation error. In the constant scheme [7], constant error compensation values were pre-computed and added to reduce the truncation error. In contrast, data-dependent error compensation approaches [8]–[10] were developed to achieve better accuracy than the constant scheme; in these, data-dependent error compensation values are added to reduce the truncation error of array and Booth multipliers (BMs).

Here, we attempt to combine configuration, partially guarded computation, and the truncation technique to design a power-efficient configurable BM (CBM). Our main concerns are power efficiency and structural flexibility. Because most common multimedia and DSP applications are based on 8–16-b operands, the proposed multiplier is designed to perform not only single 16-b but also single 8-b or twin parallel 8-b multiplication operations. The experimental results demonstrate that the proposed multiplier can provide various configurable characteristics for multimedia and DSP systems and achieve more power savings with slight area overhead.

#### 2. Configurable Booth Multiplier Design

Figure 1 shows the block diagram of the proposed 16-b CBM. In this section, partially guarded computation and the truncation technique are integrated into the configurable multiplication to construct a 16-b low-power CBM [11]. The configuration signals CM[2:0] configure the proposed multiplier into six modes, as follows. When CM[2:1] = 11 or 10, the single 16-b or single 8-b multiplication operation is performed, respectively. On the other hand, two parallel 8-b multiplication operations, which satisfy the high-throughput requirement, are carried out if CM[2:1] = 00. Bit CM[0] decides whether the output is truncated: if it is 0, truncation is performed, which yields more power saving and speed; otherwise the output product is not truncated. Whenever truncation is performed, error compensation values are added to maintain output precision.



Figure 1: Block diagram of the Configurable Booth Multiplier
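The mode selection described above can be summarized as a small decoder. The following Python sketch is only an illustration of the CM[2:0] encoding stated in the text; the function name `decode_mode` and the mode labels are hypothetical, and the handling of CM[2:1] = 01, which the text does not describe, is an assumption.

```python
def decode_mode(cm):
    """Decode a 3-bit configuration word CM[2:0] into (operation, truncated).

    Hypothetical helper: only the bit meanings come from the text.
    """
    if not 0 <= cm <= 0b111:
        raise ValueError("CM must be a 3-bit value")

    operation_bits = (cm >> 1) & 0b11  # CM[2:1] selects the operation
    truncated = (cm & 1) == 0          # CM[0] = 0 means the product is truncated

    operations = {
        0b11: "single 16-b",
        0b10: "single 8-b",
        0b00: "twin parallel 8-b",
    }
    if operation_bits not in operations:
        # Assumption: CM[2:1] = 01 is not described in the text, so reject it.
        raise ValueError("CM[2:1] = 01 is not specified")
    return operations[operation_bits], truncated


if __name__ == "__main__":
    for cm in (0b110, 0b111, 0b100, 0b101, 0b000, 0b001):
        print(format(cm, "03b"), decode_mode(cm))
```

Three operations times two truncation settings give the six modes mentioned in the text.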

#### 2.1 Dynamic Range Detector (DRD)

The proposed dynamic-range detector (*DRD*) in Figure 1 generates the switching signals SWLH, SWHH, SWHL, and SWLL for each 8-b Booth multiplication to select, for Booth encoding, the operand that makes more partial products zero. In addition to the switching signals, the *DRD* produces the shutdown signals SDLH, SDHH, SDHL, and SDLL, which dynamically disable the redundant computation of the multiplier by forcing unnecessary partial-product bits and carry propagations to zero based on the multiplication mode and the effective range of the input operands.

#### 2.1.1 Switching Logic

The switching logic serves the four 8-bit Booth multiplications whose input operands are A[15:8], B[15:8], A[7:0], and B[7:0]. If the output of a comparator is 1, the corresponding 3-bit input group consists of successive zeros or ones, so its Booth-encoded partial product is zero. Finally, the results for the two operands are compared to generate the switching signal, which determines which operand acts as the multiplier. In our design, the input operands are exchanged if the switching signal is one.
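The idea behind the switching signal can be sketched in Python as follows. This is a behavioral illustration only, not the gate-level logic of the paper: the function names are hypothetical, and it assumes that B is the Booth-encoded operand by default and that the operands are exchanged only when A yields strictly more all-zero or all-one groups.

```python
def booth_groups(x, width=8):
    """Overlapping 3-bit groups used by radix-4 Booth encoding.

    An implicit 0 is appended to the right of bit 0, and a new group
    starts every two bits.
    """
    padded = (x & ((1 << width) - 1)) << 1
    return [(padded >> i) & 0b111 for i in range(0, width, 2)]


def zero_group_count(x, width=8):
    """Number of groups that are 000 or 111, i.e. zero partial products."""
    return sum(1 for group in booth_groups(x, width) if group in (0b000, 0b111))


def switching_signal(a, b, width=8):
    """Return 1 if A and B should be exchanged (assumption: B is encoded by default)."""
    return 1 if zero_group_count(a, width) > zero_group_count(b, width) else 0


if __name__ == "__main__":
    a, b = 0x01, 0x55
    print(zero_group_count(a), zero_group_count(b), switching_signal(a, b))
```

With A = 0x01 and B = 0x55, encoding A leaves three of the four partial products at zero while encoding B leaves none, so the sketch asserts the switching signal.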

#### 2.1.2 Shutdown Logic

Given the multiplication mode and the effective range of the input operands, the shutdown logic shown in Figure 2 produces the shutdown signals SDLH, SDHH, SDHL, and SDLL, which individually shut down the AHBH, AHBL, ALBH, and ALBL multiplications. Setting a signal to zero disables the redundant computation of the corresponding multiplication by forcing its unnecessary partial-product bits and carry propagations to zero.



Figure 2: Shut down logic of the dynamic-range detector.

#### 2.2 Sign Bit Generator

If one of the input operands is zero, the entire configurable multiplier can be shut down to obtain more power savings: the input registers are prevented from loading new data and the output registers are directly reset to zero, which also increases the speed of operation. Therefore, we develop a sign bit generator (SBG), shown in Figure 3, that generates the signals SB, LZ, and HZ and shuts down the entire multiplier when one of the input operands is zero (the clock gating technique [12]).



Figure 3: Sign Bit Generator

#### 2.3 Radix 4 Booth Encoding

The radix-2 Booth algorithm does not work well when the multiplier has isolated ones. In that case the recoded multiplier has more nonzero digits than the original multiplier. Grouping the multiplier bits three at a time to find the recoded multiplier overcomes this disadvantage. To multiply A by X, the radix-4 Booth algorithm groups the bits of X into overlapping groups of three and encodes each group as one of $\{-2, -1, 0, 1, 2\}$.

#### Table 1: Truth Table of Booth Encoding Scheme (Radix 4)

| X(i) | X(i-1) | X(i-2) | Y   |
|------|--------|--------|-----|
| 0    | 0      | 0      | +0  |
| 0    | 0      | 1      | +y  |
| 0    | 1      | 0      | +y  |
| 0    | 1      | 1      | +2y |
| 1    | 0      | 0      | -2y |
| 1    | 0      | 1      | -y  |
| 1    | 1      | 0      | -y  |
| 1    | 1      | 1      | +0  |

Table 1: Radix-4 modified Booth algorithm scheme for odd values of i

Table 1 shows the rules for generating the encoded signals in the radix-4 Booth encoding (BE) scheme. The multiplication is then carried out with these recoded multiplier digits by shifting and adding the multiplicand. For negative digits, the two's complement of the multiplicand is used.
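The encoding rules of the table can be checked with a short behavioral model. The Python sketch below is an illustration, not the Verilog design of the paper: the names `BOOTH_DIGIT`, `booth_digits`, and `booth_multiply` are hypothetical, operands are assumed to be two's-complement values of even width, and the shift-and-add is done on plain integers rather than on partial-product bits.

```python
# Digit selected by each 3-bit group, following the radix-4 truth table.
BOOTH_DIGIT = {
    0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def to_signed(value, width):
    """Interpret the low `width` bits of value as a two's-complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def booth_digits(x, width=8):
    """Radix-4 Booth digits of the multiplier X, least significant digit first."""
    padded = (x & ((1 << width) - 1)) << 1  # implicit 0 right of bit 0
    return [BOOTH_DIGIT[(padded >> i) & 0b111] for i in range(0, width, 2)]


def booth_multiply(a, x, width=8):
    """Multiply A by X by summing the shifted partial products digit * A * 4^k."""
    a_signed = to_signed(a, width)
    partial_products = [
        digit * a_signed * (1 << (2 * k))
        for k, digit in enumerate(booth_digits(x, width))
    ]
    return sum(partial_products)


if __name__ == "__main__":
    print(booth_digits(0b0110, width=4))  # digits of +6
    print(booth_multiply(0x7F, 0x81))     # 127 * (-127)
```

An 8-bit multiplier produces only four partial products here, compared with eight rows for a bit-by-bit array multiplier.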

#### 2.4 Truncation and Error Compensation Circuit

For fixed-width multiplication, the least significant bits of the output product can be disabled to further reduce power consumption; this also reduces the number of adders and thereby increases the speed of operation. To incorporate this technique into the proposed multiplier, the partial products of each 8-b Booth multiplication are divided into a higher part (HP), a middle part (MP), and a lower part (LP), as shown in Figure 5(a). When truncation is performed, the partial products in LP are forced to zero. The partial products in MP are used as inputs to generate approximate carries, as shown in Figure 5(b), which are added to the carry inputs of the adder cells in HP to reduce the truncation error.
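To see why compensation is needed, the following Python sketch measures the error of plain truncation. It is a simplified model under stated assumptions, not the circuit of Figure 5: it drops every partial-product bit below column `width` (treating MP and LP alike) and adds no compensation carries, and all function names are hypothetical.

```python
def signed(value, width):
    """Interpret the low `width` bits of value as a two's-complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def booth_digits(x, width):
    """Radix-4 Booth digits of X, least significant digit first."""
    table = {0: 0, 1: 1, 2: 1, 3: 2, 4: -2, 5: -1, 6: -1, 7: 0}
    padded = (x & ((1 << width) - 1)) << 1
    return [table[(padded >> i) & 0b111] for i in range(0, width, 2)]


def fixed_width_products(a, x, width=8):
    """Return (exact, truncated) upper halves of the product A * X.

    `truncated` forces every partial-product bit below column `width`
    to zero before the addition (assumption: no error compensation).
    """
    low_mask = (1 << width) - 1
    a_signed = signed(a, width)
    exact = 0
    truncated = 0
    for k, digit in enumerate(booth_digits(x, width)):
        partial_product = digit * a_signed * (1 << (2 * k))
        exact += partial_product
        truncated += partial_product & ~low_mask  # clear the dropped columns
    return exact >> width, truncated >> width


if __name__ == "__main__":
    worst = max(
        e - t
        for a in range(256)
        for x in range(256)
        for e, t in [fixed_width_products(a, x)]
    )
    print("largest error in output LSBs:", worst)
```

In this model the truncated result is never larger than the exact one, and with four partial products it falls short by at most three output LSBs. That one-sided error is what the approximate carries generated from MP are meant to reduce.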

#### 2.5 16-Bit Multiplication Matrix

The complete multiplication is divided into four sub-multiplications using the divide-and-conquer method, namely AHBH, ALBL, AHBL, and ALBH, as shown in Figure 4, where AH denotes A[15:8] and AL denotes A[7:0], and similarly for operand B. Four independent partial-product arrays are produced using the radix-4 Booth encoding approach.



Figure 4: Multiplication matrices for 16-b multiplication
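The decomposition into AHBH, AHBL, ALBH, and ALBL can be verified numerically. The Python sketch below is an arithmetic illustration only; the function names are hypothetical, and it assumes two's-complement operands whose high bytes are treated as signed and low bytes as unsigned, a detail the text does not spell out.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def split16(value):
    """Split a 16-bit word into (signed high byte, unsigned low byte)."""
    value &= 0xFFFF
    high = value >> 8
    if high & 0x80:
        high -= 0x100  # assumption: the high byte carries the sign
    low = value & 0xFF
    return high, low


def multiply16_by_parts(a, b):
    """Rebuild the 16-b product from four 8-b sub-multiplications."""
    ah, al = split16(a)
    bh, bl = split16(b)
    ahbh = ah * bh
    ahbl = ah * bl
    albh = al * bh
    albl = al * bl
    # Weights follow from A = AH * 2^8 + AL and B = BH * 2^8 + BL.
    return (ahbh << 16) + ((ahbl + albh) << 8) + albl


if __name__ == "__main__":
    a, b = 0x1234, 0xFEDC
    print(multiply16_by_parts(a, b), to_signed16(a) * to_signed16(b))
```

The same four sub-products also explain the configurable modes: a single 8-b multiplication needs only one of them, and the twin parallel 8-b mode uses two independent ones.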

#### 2.6 Compressor and Adder

These partial products can be reduced efficiently using the Dadda tree compression technique. In the compression algorithm, the partial products are taken in groups of three, and each group is compressed to two using full adders, which act as 3:2 compressors. This process continues until all partial products, along with their carries, have been compressed. The number of stages, and hence the delay in those stages, is thereby reduced.
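The 3:2 compression step can be modeled at word level in Python. This is a sketch under stated assumptions rather than a Dadda tree proper: it groups whole rows three at a time, whereas Dadda's method schedules compressors column by column; the function names are hypothetical and the rows are assumed to be non-negative integers.

```python
def compress_3_to_2(a, b, c):
    """Carry-save addition: three rows in, a sum row and a carry row out.

    Each bit position behaves like one full adder (a 3:2 compressor).
    """
    sum_row = a ^ b ^ c
    carry_row = ((a & b) | (a & c) | (b & c)) << 1
    return sum_row, carry_row


def reduce_partial_products(rows):
    """Compress rows in groups of three until two remain, then add them.

    Returns (total, number_of_compression_stages).
    """
    rows = list(rows)
    stages = 0
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3  # rows that form complete groups
        next_rows = []
        for i in range(0, full, 3):
            next_rows.extend(compress_3_to_2(*rows[i:i + 3]))
        next_rows.extend(rows[full:])     # leftover rows pass through
        rows = next_rows
        stages += 1
    return sum(rows), stages              # final carry-propagate addition


if __name__ == "__main__":
    rows = [13, 7, 200, 45, 9, 1]
    print(reduce_partial_products(rows), sum(rows))
```

No carry propagates along a row during compression, so each stage costs about one full-adder delay, and only the final two-row addition needs a carry-propagate adder.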

# 3. Experimental Results

The radix-4 Booth multipliers for n = 8 and n = 16 and the proposed CBM for n = 16 were designed in Verilog HDL; their simulation results are tabulated below and were verified. These multipliers were synthesized using Xilinx ISE 9.2i, and also Synopsys Design Compiler, with the TSMC 90 nm CMOS standard cell technology library.



# 4. Conclusion

The experimental results show that the proposed multiplier outperforms the conventional radix-2 and radix-4 Booth multipliers in terms of power and speed of operation, with sufficient accuracy, at the expense of extra area.

# 5. Acknowledgements

I am grateful to M. Chennakeshav Reddy, Principal of Bharat Institute of Technology and Science for Women; Ch. Venugopal Reddy, Director; P. Rajani Kumari, HOD; and my guide, Ms. P. Sandhya Rani, for providing an excellent environment to work in, with ample facilities and academic freedom.

### References

- [1] J. Choi, J. Jeon, and K. Choi, "Power minimization of function units by partially guarded computation," in Proc. Int. Symp. Low Power Electron. Des., Jul. 2000, pp. 131–136.
- [2] A. Fayed and M. A. Bayoumi, "A novel architecture for low-power design of parallel multipliers," in Proc. IEEE Comput. Soc. Annu. Workshop VLSI, Apr. 2001, pp. 149–154.
- [3] N. Honarmand and A. A. Kusha, "Low power minimization combinational multipliers using data-driven signal gating," in Proc. IEEE Int. Conf. Asia-Pacific Circuits Syst., Dec. 2006, pp. 1430–1433.
- [4] K.-H. Chen and Y.-S. Chu, "A spurious-power suppression technique for multimedia/DSP applications," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 1, pp. 132–143, Jan. 2009.
- [5] T. Yamanaka and V. G. Moshnyaga, "Reducing energy of digital multiplier by adjusting voltage supply to multiplicand variation," in Proc. 46th IEEE Midwest Symp. Circuits Syst., Dec. 2003, pp. 1423–1426.
- [6] N.-Y. Shen and O. T.-C. Chen, "Low-power multipliers by minimizing switching activities of partial products," in Proc. IEEE Int. Symp. Circuits Syst., May 2002, vol. 4, pp. 93–96.
- [7] M. J. Schulte and E. E. Swartzlander Jr., "Truncated multiplication with correction constant," in Proc. Workshop VLSI Signal Process., Oct. 1993, pp. 388–396.
- [8] S. J. Jou, M. H. Tsai, and Y. L. Tsao, "Low-error reduced-width Booth multipliers for DSP applications," IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 50, no. 11, pp. 1470–1474, Nov. 2003.
- [9] K. J. Cho, K. C. Lee, J. G. Chung, and K. K. Parhi, "Design of low-error fixed-width modified Booth multiplier," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 12, no. 5, pp. 522–531, May 2004.
- [10] T.-B. Juang and S.-F. Hsiao, "Low-power carry-free fixed-width multipliers with low-cost compensation circuit," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 52, no. 6, pp. 299–303, Jun. 2005.
- [11] S.-R. Kuang and J.-P. Wang, "Design of power efficient configurable Booth multiplier," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 3, pp. 568–580, Mar. 2010.
- [12] T. Kitahara, F. Minami, T. Ueda, K. Usami, S. Nishio, M. Murakata, and T. Mitsuhashi, "A clock-gating method for low-power LSI design," in Proc. Int. Symp. Low Power Electron. Des., Feb. 1998, pp. 07–312.

Volume 3 Issue 7, July 2014, www.ijsr.net

# **Author Profile**



**S. Swathi** received her B.Tech degree from Sreekavitha Engineering College, Karepally, Khammam, and is pursuing her M.Tech at Bharat Institute of Technology and Science for Women, Hyderabad, India.



**Ms. P. Sandhya Rani** is an Assistant Professor in the ECE department. She received her M.Tech (VLSI System Design) from JNTUH University in 2012 and is working as an Assistant Professor at Bharat Institute of Technology and Science for Women, Hyderabad, India.