Abstract: In this paper, we develop a new methodology for designing a lower-error and area efficient 2’s complement fixed-width Booth multiplier that receives two n-bit numbers and produces an n-bit product. By properly choosing the generalized index and binary thresholding, we derive a better error-compensation bias to reduce the truncation error. Since the proposed error compensation bias is realizable, constructing low-error fixed width Booth multiplier is area and time efficient for VLSI implementation. Finally, we successfully apply the proposed fixed-width Booth multiplier to FIR filter. The simulation results show that the performance is superior to by using the direct-truncation fixed-width Booth multiplier.

Keywords: Modified Booth Multiplier, Booth Encoder, partial product, FIR, Signed-unsigned.

1. Introduction

Booth's multiplication algorithm is an algorithm which multiplies two signed binary numbers in two's complement notation. The algorithm was invented by Andrew Donald Booth in 1950 while doing research on crystallography at Birkbeck College in Bloomsbury, London. Booth used desk calculators that were faster at shifting than adding and created the algorithm to increase their speed. Booth's algorithm is of interest in the study of computer architecture. It can be used for both signed-magnitude numbers as well as 2’s complement numbers with no need for a correction term or a correction step.

This technique has the advantage of reducing the number of partial products by one half regardless of inputs. The encoding is performed within two steps: encoding and selection. The purpose of the encoding is to scan the triplet of bits of the multiplier and define the operation to be performed on the multiplicand, as shown in fig.1. This method is actually an application of a sign-digit representation in radix-4 [1]. The Booth-MacSorley algorithm, usually called the Modified booth algorithm or simply the booth algorithm. For example, a 3-bit recording would require the following set of digits to be multiplied by the multiplicand: 0, ±1, ±2, ±3. The difficulty lies in the fact that ±3Y is computed by summing (or subtracting) 1 to ±2Y, which means that a carry propagation occurs. The delay caused by the carry propagation renders this scheme to be slower than a conventional one. Consequently, only the 2 bit Booth recording is used.

2. Conventional Modified Booth Multiplier

Booth's algorithm examines adjacent pairs of bits of the N-bit multiplier \( Y \) in signed two's complement representation, including an implicit bit below the least significant bit, \( y_{-1} = 0 \). For each bit \( y_i \) for \( i \) running from 0 to \( N-1 \), the bits \( y_i \) and \( y_{i-1} \) are considered. Where these two bits are equal, the product accumulator \( P \) is left unchanged. Where \( y_i = 0 \) and \( y_{i-1} = 1 \), the multiplicand times \( 2^i \) is added to \( P \); and where \( y_i = 1 \) and \( y_{i-1} = 0 \), the multiplicand times \( 2^i \) is subtracted from \( P \). The final value of \( P \) is the signed product. Encoder and partial product generation circuit is shown in 2(a) and 2(b), corresponding truth table in table1.

The representation of the multiplicand and product are not specified; typically, these are both also in two's complement representation, like the multiplier, but any number system that supports addition and subtraction will work as well. As stated here, the order of the steps is not determined. Typically, it proceeds from LSB to MSB, starting at \( i = 0 \); the multiplication by \( 2^i \) is then typically replaced by incremental shifting of the \( P \) accumulator to the right between steps; low bits can be shifted out, and subsequent additions and subtractions can then be done just on the highest \( N \) bits of \( P \) [1].

![Figure 1: Implementation of modified booth recording](image1.png)
3. Existing Method

Multipliers are key components of many high performance systems such as FIR filters, microprocessors, digital signal processors, etc. A system’s performance is generally determined by the performance of the multiplier because the multiplier is generally the slowest element in the system. Furthermore, it is generally the most area consuming. Hence, optimizing the speed and area of the multiplier is a major design issue. However, area and speed are usually conflicting constraints so that improving speed results mostly in larger areas. In parallel multipliers, high performance can be achieved by using modified Booth encoding, which decreases the number of partial products by a factor of 2 through multiplier recording.

Fixed word size in lossy systems that allow small accuracy loss to output data, can be maintained by using n×n fixed width multipliers which result in only n most significant product bits. Complexity reduction in hardware and significant area and power savings can be achieved by removing the adder cells of standard multiplier for the computation of the n less significant bits of 2n-bit output product. However, the fallout being high truncation error will be introduced into the system to which direct-truncated fixed-width multiplier. Variety of error compensation methods that add estimated compensation value to inputs of the reserved adder cells, which effectively arrests the truncation error. As it is universally known that error compensation value can be derived by the Adaptive/Constant scheme [2]. Regardless of current input data value influence, the Constant Scheme arrives to constant error compensation value and same is used to carry inputs of the retained adder cells while performing multiplication operations. Less hardware results in a relatively large truncation error to the Constant Scheme.

Adaptive scheme has been designed such that high accuracy is achieved by adjusting the compensation value according to the input data at the cost of higher hardware complexity. However, most of the adaptive error compensations are developed only for Fixed-width array multipliers & has less influence on reduction of truncation error for a fixed width modified booth multipliers directly. Various error compensation approaches, have been proposed to effectively scale down the truncation errors of Fixed-width modified Booth Multiplier at the cost of hardware complexity. Simple error compensation circuit results in better error performance and area with booth encoded outputs as inputs to generate the error compensation value.

The Modified Booth multiplier is an extension of Booth’s multiplier. In Modified Booth, the number of partial products reduced by N/2, that is half of total partial products as compare to simple multiplication process [3]. So, clearly if the number of partial products becomes reduced, the area of the multiplier also will reduce and automatically as the result of it, the speed will increased. So, this multiplier is more efficient and Comparison shown in table 2.

### Table 2: Comparison of sign-magnitude number multiplication with and without booth encoding

<table>
<thead>
<tr>
<th></th>
<th>Booth encoding</th>
<th>No Booth encoding</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Internal Representation:</strong></td>
<td>2’s complement (some partial products need to be subtracted)</td>
<td>Sign magnitude (all the partial products are positive)</td>
</tr>
<tr>
<td><strong>Hardware for encoding and selection</strong></td>
<td>One row of 4:2 compressor</td>
<td>Only one XOR is used to compute the sign in parallel</td>
</tr>
<tr>
<td><strong>Sign extension</strong></td>
<td>2 extra bits (sign extension and complementation)</td>
<td>No extra bits</td>
</tr>
<tr>
<td><strong>The normalization requires some Leading Zero Detectors and Leading One Detectors</strong></td>
<td>The normalization and even the rounding is easy</td>
<td>The simplicity of the schematic allows a highly regular layout</td>
</tr>
<tr>
<td><strong>1 XOR + 1 AND (encoding), 2 XOR+1 AND (multiplexer)</strong></td>
<td>1 AND (partial product generation), 3 XOR (4:2 compressor)</td>
<td>Total: 3 XOR +2 AND</td>
</tr>
<tr>
<td><strong>Total: 3 XOR +2 AND</strong></td>
<td></td>
<td>Total: 3 XOR +1 AND</td>
</tr>
</tbody>
</table>

3.1 Modified Booth Encoder (MBE)

Modified Booth encoding is most often used to avoid variable size partial product arrays. Before designing a MBE, the multiplier B has to be converted into a Radix-4 number by dividing them into three digits respectively according to Booth Encoder, Prior to convert the multiplier, a zero is appended into the Least Significant Bit (LSB) of the multiplier [4]. Table 3 shows the truth table for a booth encoder with two’s, one’s and correction bit [4]-[6].
The operand that is booth encoded is called multiplier, and the other operand is called multiplicand. If a 3-bit binary input sequence is given at the input, and perform the operation as mentioned in front of it [4], the partial products will be generated. As mentioned in fig.3, there are total 3 combinations for generating a partial product i.e., the obtained partial product is dependent upon the combination of three binary bits of the multiplier.

4. Proposed Method

In Booth multiplier Multiplication operation when considered between multiplicand and the multiplier namely A & B respectively, representation of which is depicted as below:

Let us consider the multiplication operation of two -bit signed numbers $A = a_{n-1}a_{n-2} ... a_0$(multiplicand) and $B = b_{n-1}b_{n-2} ... b_0$(multiplier). The two’s complement representations of A and B can be expressed as follows:

$$A = -a_{n-1}2^{n-1} + \sum_{i=0}^{n-2} a_i 2^i,$$

$$B = -b_{n-1}2^{n-1} + \sum_{i=0}^{n-2} b_i 2^i ... ... (1)$$

Modified Booth encoding groups the multiplier bits into triplets, and are encoded as given in table II, then B can be expressed as:

$$B = \sum_{i=0}^{n-1} M_i 2^{2i} = \sum_{i=0}^{n-1}(-2b_{2i+1} + b_{2i} + b_{2i-1})2^{2i} ... ... (2)$$

Where $b-1 = 0$ and $K_j$ belongs to the set of values {-2, -1, 0, 1, 2}.

According to the Modified Booth encoding table the existing booth encoder and partial product generation circuit are shown in figure 4 (a) and figure 4 (b) respectively.

With relation to negation operation, each bit of multiplicand $A$ is complemented and an extra binary value „1” is added to least significant bit pertaining to next partial product row. To implement correction bit $C_j$ addition of, „1” is being used and thereby indicating the partial product row $PP_j$ as positive ($C_j=0$) or negative ($C_j=1$). As each partial product row is represented in two’s complementation, the sign bit for $PP_j$ each must be properly extended up to the $(2n-1)^{th}$bit position. Many ways [14] are proposed to give a solution to eliminate the problem of sign extension bits as they affect the performance and power consumption of the parallel multiplier with large values of $n$. The partial product matrix of an 88 modified booth multiplier with sign extension elimination technique is illustrated in Fig 5, where in $PP_j$ and $S_j$ denote the $K^{th}$ product bit and the sign bit of partial product row $PP_j$, respectively.

4.1 Fixed-Width Modified Booth Multiplier

In the post-truncated modified Booth multiplier (PTM) the with the addition of an extra „1”(carry value by $LSP_{min}$) at the $(n-1)$th bit position of partial product matrix as depicted in Fig 6 (including „1”) then outputs the highly significant $n$ product bits.

Table 3: Modified booth encoding table

<table>
<thead>
<tr>
<th>Booth Encoder i/p $b_{2i+1}b_{2i}b_{2i-1}$</th>
<th>Operation</th>
<th>Booth encoder o/p $N_j t_j o_j Z_j C_j$</th>
</tr>
</thead>
<tbody>
<tr>
<td>0  0  0</td>
<td>+0</td>
<td>0  0  0  10</td>
</tr>
<tr>
<td>0  0  1</td>
<td>+A</td>
<td>0  0  1 00</td>
</tr>
<tr>
<td>0  1  0</td>
<td>+A</td>
<td>0  0  1 00</td>
</tr>
<tr>
<td>0  1  1</td>
<td>+2A</td>
<td>0  1  0 00</td>
</tr>
<tr>
<td>1  0  0</td>
<td>-2A</td>
<td>1  1  0 01</td>
</tr>
<tr>
<td>1  0  1</td>
<td>-A</td>
<td>1  0  1 01</td>
</tr>
<tr>
<td>1  1  0</td>
<td>-A</td>
<td>1  0  1 01</td>
</tr>
<tr>
<td>1  1  1</td>
<td>-0</td>
<td>1  0  0 10</td>
</tr>
</tbody>
</table>

Figure 4(a): Modified Booth encoder

Figure 4(b): Partial product generation circuit
4.2 Multiplier in FIR Filter

As the complexity of digital filters is dominated by the number of multiplications, many works have focused on minimizing the complexity of multiplier blocks that compute the constant coefficient multiplications required in filters. The problem of designing FIR filters has received great attention during the last decade, as the filters are suffering from a large number of multiplications, leading to excessive area and power consumption even if implemented in full custom integrated circuits [8]. Early works have focused on replacing multiplications by decomposing them into simple operations such as addition, subtraction and shifting shown in Fig. 7.

As the coefficients of an application specific filter are constant, the decomposition is more efficient than employing multipliers. The complexity of FIR filters in this case is dominated by the number of additions/subtractions used to implement the coefficient multiplications.

5. Results and Observations

6. Conclusion

In this paper, we present a 8-bit×8-bit advanced multiplier capable of carrying out both signed and unsigned operations. The proposed unified signed/unsigned multiplier was optimized in terms of speed, power consumption and silicon area by exploiting more regular partial product array, developing more efficient compression methods and combining several types of fast adders. Booth multiplier in FIR gives efficient results than FIR filter using array multiplier.

References


Author Profile

B. Sireesha obtained her B.Tech degree in Electronics and Communication Engineering from Don Bosco institute of technology and science, Guntur, Andhra Pradesh, India in 2012 and she is currently pursuing M.Tech in VLSI Design from Hindustan University, Chennai, Tamilnadu. Her field of interest includes high speed and low power VLSI architecture, Embedded Systems and DSP applications.

Diana Aloshius received her B.E. degree in Electronics and Instrumentation Engineering from Bharathiar University, India in 2004 and M.E. degree in Applied Electronics from Anna University, India in 2006. Currently, she is working as an Assistant Professor in the Department of Electronics and Communication Engineering, Hindustan University, Chennai. Her research interests include Digital Circuits and logic design, Low power VLSI and Reversible logic and synthesis.