VLSI Implementation of Parallel Prefix Subtractor using Modified 2’s Complement Technique and BIST Verification using LFSR Technique

Malti Kumari¹, Vipin Gupta², Gaurav K Jindal³

¹M.Tech Student, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
²Assistant Professor, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
³Design Engineer, Priganik Technologies, Jaipur, Rajasthan, India

Abstract: The parallel prefix subtractor is a flexible and widely used structure for binary addition and subtraction, and it is well suited to VLSI implementation. No parallel prefix subtractor structures intended specifically to optimize area have been proposed in past years. This paper presents a new approach to redesigning the basic operators used in parallel prefix architectures, in which the subtraction unit uses a modified 2’s complement technique. Verification is performed with a linear feedback shift register (LFSR), so no input has to be applied manually to carry out the subtraction. We analyze the difference in area between the parallel prefix subtractor and the built-in self-test (BIST) architecture of the parallel prefix subtractor. The number of multiplexers contained in each slice of an FPGA is considered in the redesign of the basic operators used in the parallel prefix tree. The experimental results indicate that the new basic operators make some parallel prefix subtractor architectures faster and more area efficient.

Keywords: Parallel Prefix Adder, 2’s complement, Optimize Area, BIST architecture, LFSR Approach.

1. Introduction

1.1 Parallel-Prefix Addition Basics

The parallel prefix adder is a critical component in most digital circuit designs, including digital signal processors (DSPs) and microprocessor datapath units. Consequently, extensive research continues to focus on improving the power-delay performance of the adder. In VLSI implementations, parallel-prefix adders are known to have the best performance. Reconfigurable logic such as Field Programmable Gate Arrays (FPGAs) has been gaining popularity in recent years because it offers improved speed and power over DSP-based and microprocessor-based solutions for many practical designs, including mobile DSP and telecommunications applications, and a significant reduction in development time and cost over Application Specific Integrated Circuit (ASIC) designs. The power advantage is especially important with the growing popularity of mobile and portable electronics, which make extensive use of DSP functions. However, because of the structure of the configurable logic and routing resources in FPGAs, parallel-prefix adders will perform differently than in VLSI implementations. In particular, most modern FPGAs employ a fast carry chain that optimizes the carry path for the simple Ripple Carry Adder (RCA). In this paper, the practical issues involved in designing and implementing tree-based adders on FPGAs are described. An efficient testing strategy for evaluating the performance of these adders is discussed. Several tree-based adder structures are implemented and characterized on an FPGA and compared with the Ripple Carry Adder (RCA) and the Carry Skip Adder (CSA). Finally, some conclusions and suggestions for improving FPGA designs to enable better tree-based adder performance are given.

2. Related Work

Xing and Yu noted that delay models and cost analysis for adder designs developed for VLSI technology do not map directly to FPGA designs. They compared the design of the ripple carry adder with the carry-lookahead, carry-skip, and carry-select adders on the Xilinx 4000 series FPGAs. Only an optimized form of the carry-skip adder performed better than the ripple carry adder when the adder operands were wider than 56 bits. A study of adders implemented on the Xilinx Virtex II yielded similar results [9]. In [10], the authors considered several parallel prefix adders implemented on a Xilinx Virtex 5 FPGA. They found that the simple RCA is superior to the parallel prefix designs because the RCA can exploit the fast carry chain on the FPGA.

Kogge-Stone: The Kogge-Stone tree [22] (Figures 1-5) achieves both log2(n) stages and a fan-out of 2 at each stage. This comes at the cost of long wires that must be routed between stages. The tree also contains more PG cells; while this may not affect the area if the adder layout is on a regular grid, it will increase power consumption. Despite these costs, the Kogge-Stone adder is generally used for wide adders because it shows the lowest delay among the competing structures.
Figure 1: The Parallel Prefix addition

Another carry-tree adder, known as the spanning tree carry-lookahead (CLA) adder, is also examined [6]. Like the sparse Kogge-Stone adder, this design terminates with a 4-bit RCA. As the FPGA uses a fast carry chain for the RCA, it is interesting to compare the performance of this adder with the sparse Kogge-Stone and regular Kogge-Stone adders. Also of interest for the spanning tree CLA are its testability features [7].

Figure 2: 128-bit Kogge-Stone adder

Figure 3: Spanning Tree Carry Lookahead Adder (16 bit)

3. New Approach to 2’s Complement

Let us consider the data A whose 2’s complement is required, for example multiplier data to be used with negative partial product factors. To calculate the 2’s complement, first invert all the bits of A and denote the result as Abar. Then, starting from the least significant bit, perform an "Exclusive OR" (XOR) operation on Abar(0) with 1'b1, Abar(1) with Abar(0), Abar(2) with Abar(1), and so on, up to and including the first position at which a 1'b0 is found while traversing the bits Abar(i). Once that 1'b0 has been processed, the remaining higher-order bits of Abar are kept as they are, without any change.

As an example, let A = 10101000 and let the 2’s complement of A be denoted as A2_c_bar. Then:

Step 1: Abar = 01010111.
Step 2:
A2_c_bar(0) = Abar(0) xor 1 = 1 xor 1 = 0
A2_c_bar(1) = Abar(1) xor Abar(0) = 1 xor 1 = 0
A2_c_bar(2) = Abar(2) xor Abar(1) = 1 xor 1 = 0
A2_c_bar(3) = Abar(3) xor Abar(2) = 0 xor 1 = 1 (first 0 in Abar, so the XOR chain stops here)
A2_c_bar(4) = Abar(4) = 1
A2_c_bar(5) = Abar(5) = 0
A2_c_bar(6) = Abar(6) = 1
A2_c_bar(7) = Abar(7) = 0

Result: A2_c_bar = 01011000.
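
The two steps above can be sketched in software as a bit-level model. This is only an illustrative sketch, not the authors' hardware description: the function names are hypothetical, and bits are held in a Python list with index 0 as the least significant bit.

```python
def modified_twos_complement(a_bits):
    """Return the 2's complement of a_bits (list of 0/1, index 0 = LSB)."""
    abar = [1 - b for b in a_bits]   # Step 1: invert every bit of A
    out = list(abar)                 # higher-order bits default to Abar unchanged
    prev = 1                         # Abar(0) is XORed with the constant 1'b1
    for i, bit in enumerate(abar):
        out[i] = bit ^ prev          # Step 2: Abar(i) xor Abar(i-1)
        if bit == 0:                 # first 0 in Abar: this is the last XORed bit
            break
        prev = bit
    return out


def from_string(s):
    """Convert an MSB-first bit string into an LSB-first bit list."""
    return [int(c) for c in reversed(s)]


def to_string(bits):
    """Convert an LSB-first bit list into an MSB-first bit string."""
    return "".join(str(b) for b in reversed(bits))


if __name__ == "__main__":
    print(to_string(modified_twos_complement(from_string("10101000"))))
```

Run on the example A = 10101000, the sketch gives 01011000, the same value obtained by inverting A and adding 1.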

4. Parallel Prefix Subtractor

Figure 4 shows the subtraction unit, a parallel prefix subtractor built with the modified 2’s complement approach. The output of this approach is analyzed in detail in the results section of this paper.

Figure 4: Parallel Prefix Subtractor
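
A word-level sketch of such a subtractor is given here under stated assumptions. The paper does not say which prefix network Figure 4 uses, so a Kogge-Stone network is assumed; the 16-bit width matches the operands used in the results, and all function names are hypothetical. The subtraction a - b is formed as a plus the modified 2’s complement of b.

```python
WIDTH = 16                      # assumed operand width
MASK = (1 << WIDTH) - 1


def twos_complement_modified(b):
    """Invert b, then flip the inverted bits up to and including its first 0."""
    bbar = ~b & MASK
    lowest_zero = ~bbar & (bbar + 1) & MASK     # lowest 0 bit of bbar, 0 if none
    flip = ((lowest_zero << 1) - 1) & MASK if lowest_zero else MASK
    return bbar ^ flip


def kogge_stone_add(a, b):
    """Add two WIDTH-bit words with a Kogge-Stone style prefix network."""
    g = a & b                   # bitwise generate
    p = a ^ b                   # bitwise propagate
    half_sum = p
    shift = 1
    while shift < WIDTH:        # log2(WIDTH) prefix stages
        g = (g | (p & (g << shift))) & MASK     # combine group generate signals
        p = (p & (p << shift)) & MASK           # combine group propagate signals
        shift <<= 1
    carries = (g << 1) & MASK   # carry into bit i is the group generate of bits i-1..0
    return half_sum ^ carries


def prefix_subtract(a, b):
    """Compute (a - b) mod 2**WIDTH as a + 2's complement of b."""
    return kogge_stone_add(a, twos_complement_modified(b))


if __name__ == "__main__":
    print(format(prefix_subtract(0x1111, 0x1111), "016b"))
```

Each pass of the loop corresponds to one stage of the prefix tree, so a 16-bit operand needs four stages.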

5. BIST Approach

The built-in self-test (BIST) architecture performs on-chip verification of the circuit under test (CUT), so no external stimulus has to be applied to the input drivers. Figures 5 and 6 show the logical architecture of BIST capability using the LFSR technique, which can be applied to any circuit in VLSI design.
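
The idea can be illustrated with a small software model. The sketch below is an assumption-laden illustration rather than the authors' design: the paper does not give the LFSR polynomial, so the commonly used 16-bit maximal-length polynomial x^16 + x^14 + x^13 + x^11 + 1 is assumed, two LFSRs with arbitrary non-zero seeds drive the two operands, and the names `lfsr_step`, `run_bist`, and `subtractor_under_test` are hypothetical.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1


def lfsr_step(state):
    """Advance a 16-bit Fibonacci LFSR by one shift (assumed taps 16, 14, 13, 11)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def run_bist(cut, seed_a=0xACE1, seed_b=0x1D2C, patterns=1000):
    """Drive the CUT with LFSR patterns and return True if every output
    matches the expected difference, False on the first mismatch."""
    a, b = seed_a, seed_b
    for _ in range(patterns):
        expected = (a - b) & MASK        # reference (expected) output
        if cut(a, b) != expected:        # compare with the CUT output
            return False
        a, b = lfsr_step(a), lfsr_step(b)
    return True


def subtractor_under_test(a, b):
    """Stand-in CUT: a + inverted b + 1, truncated to WIDTH bits."""
    return (a + (~b & MASK) + 1) & MASK


if __name__ == "__main__":
    print(run_bist(subtractor_under_test))
```

A seed of zero must be avoided, because an all-zero LFSR state never changes.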
6. BIST approach of Parallel prefix Subtractor (PPS)

6.1 Conventional parallel prefix Subtractor

Input a = 0001000100010001 (16'h1111)
Input b = 0001000100010001 (16'h1111)
Output = 0000000000000000 (16'h0000)

Area analysis
Logic Utilization:
- Number of 4 input LUTs: 48 out of 3,940 (1%)
- Logic Distribution:
  - Number of occupied Slices: 31 out of 31,921 (1%)
  - Number of slices containing only related logic: 81 out of 31,904 (0.26%)
  - Number of slices containing unrelated logic: 0 out of 31,921 (0.00%)

*See NOTE below for an explanation of the effect of unrelated logic

Total Number of input LUTs: 48 out of 3,940 (1%)
Number used as logic: 48
Number used as a route-thru: 2
Number of bonded IOEs: 43 out of 141 (30%)

Total equivalent gate count for design: 366
Additional VHDL gate count for IOEs: 16,852
Peak memory usage: 99 KB

Paper ID: SEP14104

Figure 5: LFSR Circuit

Figure 6: BIST approach

Figure 7: BIST with PPS

Figure 8: Functional simulation waveform

Figure 9: RTL view of conventional (Top design)

Figure 10: RTL view of conventional (Internal design)

Figure 11: Area description of conventional design
Timing analysis

Final timing analysis requires a post-place-and-route check.

Maximum combinational path delay: 27.116ns

Timing Detail:

All values displayed in nanoseconds (ns)

Timing constraint: Default path analysis

delay: 27.116ns (Levels of Logic = 19)

Source: Bch (FAB)

Destination: Cont (FAB)

Data Path: Bch to Cont

<table>
<thead>
<tr>
<th>Cell</th>
<th>Input</th>
<th>fanout</th>
<th>Delay</th>
<th>Delay</th>
<th>Logical Name (Net Name)</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUT4:13-30</td>
<td>2</td>
<td>0.720</td>
<td>0.465</td>
<td>0.000</td>
<td>...</td>
</tr>
<tr>
<td>LUT3:12-12</td>
<td>2</td>
<td>0.720</td>
<td>0.465</td>
<td>0.000</td>
<td>...</td>
</tr>
<tr>
<td>LUT3:12-10</td>
<td>2</td>
<td>0.720</td>
<td>0.465</td>
<td>0.000</td>
<td>...</td>
</tr>
<tr>
<td>LUT3:12-8</td>
<td>2</td>
<td>0.720</td>
<td>0.465</td>
<td>0.000</td>
<td>...</td>
</tr>
<tr>
<td>LUT3:12-6</td>
<td>2</td>
<td>0.720</td>
<td>0.465</td>
<td>0.000</td>
<td>...</td>
</tr>
<tr>
<td>LUT3:12-4</td>
<td>2</td>
<td>0.720</td>
<td>0.465</td>
<td>0.000</td>
<td>...</td>
</tr>
</tbody>
</table>

Figure 12: Timing description of conventional design

6.2 Proposed parallel prefix Subtractor

Input a = 0001000100010001 (16'h1111)

Input b = 0001000100010001 (16'h1111)

Output = 0000000000000000 (16'h0000)

Figure 13: Waveform of Proposed Design

RTL VIEW of proposed design

Figure 14: Bottom level design

Area Analysis

<table>
<thead>
<tr>
<th>Design</th>
<th>Area (gate Count)</th>
<th>Delay</th>
</tr>
</thead>
<tbody>
<tr>
<td>Conventional</td>
<td>366</td>
<td>27.116ns</td>
</tr>
<tr>
<td>Proposed</td>
<td>324</td>
<td>24.091ns</td>
</tr>
</tbody>
</table>

Table 1: Comparison of the conventional and proposed PPS (Parallel Prefix Subtractor) designs
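
The relative savings implied by the comparison table can be checked with a short calculation. The sketch takes the conventional design as the baseline, which is an assumption about how the percentages should be defined; the helper name is hypothetical.

```python
def percent_reduction(conventional, proposed):
    """Reduction of the proposed value relative to the conventional value, in percent."""
    return 100.0 * (conventional - proposed) / conventional


area = percent_reduction(366, 324)            # equivalent gate counts
delay = percent_reduction(27.116, 24.091)     # combinational path delays in ns

if __name__ == "__main__":
    print(f"area reduction:  {area:.1f}%")    # 11.5%
    print(f"delay reduction: {delay:.1f}%")   # 11.2%
```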

BIST result

Waveform of the BIST architecture: the result signal is asserted (set to 1) when the expected output and the output of the proposed design are the same.

Volume 3 Issue 9, September 2014

www.ijsr.net

Licensed Under Creative Commons Attribution CC BY
9. Conclusion

In this paper we have presented a parallel prefix subtractor based on the prefix algorithm. The proposed design uses a modified 2’s complement method; relative to the conventional design, the equivalent gate count falls from 366 to 324 (approximately 11.5%) and the delay falls from 27.116 ns to 24.091 ns (approximately 11.2%). A BIST architecture using the LFSR technique is also introduced for on-chip verification. In future, this work can be extended to the multiplication part of modulo arithmetic operations.

References

[12] Power-conscious synthesis of parallel prefix adders under bitwise timing constraints, Proc. the Workshop on System Integration of Mixed Information Technologies (SASIMI), Sapporo, Japan, October 2007, pp. 7–14.
[14] M. Pedram. Power minimization in ic design: