FIR Filter Design Based on CSD and MSD Concepts for Fixed Applications

Raseena K.A1, Vincy Mathew2

1Malabar College of Engineering and Technology, Department of Electronics and Communication, Thrissur, India
2Assistant Professor, Malabar College of Engineering and Technology, Department of Electronics and Communication, Thrissur, India

Abstract: In some applications the coefficients of FIR filters remain fixed. This paper describes the design of an FIR filter for such fixed-coefficient applications. An FIR filter designed with the transpose form structure is usually a pipelined architecture and supports the MCM (multiple constant multiplications) technique, which results in large computation savings. In this paper, the speed of the FIR filter is increased by combining the CSD (canonic signed digit) and MSD (minimal signed digit) representations of numbers. The MSD representation is suitable for the common sub expression elimination method and significantly reduces the number of adders needed in the filter design. The proposed structure has a significantly lower area delay product (ADP) and energy per sample (EPS) than the existing FIR structure.

Keywords: FIR, MCM, CSD, MSD, ADP, EPS

1. Introduction

Many applications require FIR filters of large order to meet stringent frequency specifications [2]–[4]. Very often these filters need to support a high sampling rate for high-speed digital communication [5]. The number of multiplications and additions required for each filter output, however, increases linearly with the filter order. Chen and Chiueh have proposed a canonic signed digit (CSD)-based reconfigurable FIR (RFIR) filter, where the nonzero CSD values are modified to reduce the precision of filter coefficients without significant impact on filter behavior. However, the reconfiguration overhead is significantly large and the design does not provide an area-delay efficient structure. Such architectures are more appropriate for lower order filters and are not suitable for channel filters due to their large area complexity. We explore the possibility of realizing a block FIR filter in transpose form configuration in order to take advantage of the MCM schemes and the inherent pipelining for area-delay efficient realization of large order FIR filters. The main contributions of this paper are as follows:

1) A low-complexity design method using the MCM scheme for the block implementation of fixed FIR filters.
2) CSD and MSD representations of numbers that increase the speed of the FIR filter.
3) A structure with significantly lower area delay product (ADP) and energy per sample (EPS) than the existing FIR structure.

2. Filter Design

The input for the ADSL circuit requires 12 bits at 4 MHz. The ADC can be realized with a multistage multi-bit sigma-delta modulator, as suggested by Azadet, followed by a suitable decimation filter to reduce the sampling rate and achieve the required resolution. We use a decimation factor of 16 to decimate from 64 MHz to 4 MHz. The pass band of the filter is 2 MHz with a pass band ripple of 0.001, which corresponds to the flat response of the filter. The stop band attenuation is specified according to the output word resolution. Considering 6 dB per bit, a stop band attenuation of at least 72 dB (12 bits × 6 dB per bit) is required to support the accuracy of our 12-bit output.
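The arithmetic behind these specifications can be checked with a few lines of Python. This is only an illustrative sketch; the function names `required_attenuation_db` and `output_rate_hz` are hypothetical and not part of the design flow described here.

```python
def required_attenuation_db(bits, db_per_bit=6.0):
    """Stop band attenuation needed for a given output word length,
    using the 6 dB per bit rule of thumb quoted in the text."""
    return bits * db_per_bit


def output_rate_hz(input_rate_hz, decimation_factor):
    """Sampling rate at the output of a decimator."""
    return input_rate_hz / decimation_factor


# 12-bit output word and decimation from 64 MHz by a factor of 16
attenuation = required_attenuation_db(12)
rate = output_rate_hz(64e6, 16)
print(attenuation, "dB,", rate / 1e6, "MHz")
```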

Once the decimation filter is specified, we use VHDL to implement the hardware. The input signals are fed from the sigma-delta modulator to a multiplexer, which selects the input signal that is sent to the comb filter; the decimation filter then processes the selected signal with a down-sampling rate of 128. Note that the comb filter works with the larger down-sampling rate (=16), whereas that of the FIR filter is 2. The main function of the FIR filter is to remove high-frequency noise. Furthermore, we emphasize that this research focuses on the integration of a comb filter and a low pass (LP) finite impulse response (FIR) filter to implement an application-specific integrated circuit (ASIC).

Figure 1: Digital filter design

In order to verify the frequency-division function of the decimation filter, we first use MATLAB. Fig. 1 shows the simulation block of the decimation filter in MATLAB. Note that the input signal has a frequency of 2.5 kHz. After oversampling, the sampling frequency is 2.56 MHz, and the signal is fed to the input of the first stage of the FIR filter. The 2.56 MHz input rate is divided by 128 and the result is sent to the output of the decimation filter. That is, the output frequency is 20 kHz. This means that high-frequency noise above 20 kHz is removed by the decimation filter. In practice, if we feed the decimation filter with a 2.5 kHz signal, the 2.5 kHz main frequency component is clearly obtained at the output port of the decimation filter.

In addition to the architectural-level technique, circuit-level techniques are also presented and used in the FIR filter implementation. In the computation sharing multiplier (CSHM) structure, adders are critical for performance. A new carry-select adder, which is based on dual transition skewed logic (DTSL), is presented and efficiently used in our filter implementation. The proposed carry-select adder based on DTSL is superior to the Domino-based carry-select adder in terms of power and performance. Flip-flops are also crucial elements from both a delay and a power standpoint. The conditional capture flip-flop (CCFF) is explained and used in our filter design. The CCFF is a dynamic style flip-flop that has a negative setup time and a small clock-to-output delay. Moreover, depending on data switching activity, the CCFF also reduces the power consumption.

Using the CSHM presented in Fig. 2, a 10-tap FIR filter with programmable coefficients has been implemented for fabrication. An FIR filter can be implemented in direct form (DF) [1] or transposed direct form (TDF) architecture.

In the DF FIR filter, a large adder in the final stage lies on the critical path and slows down the filter. For a high-performance filter structure, TDF is used in our implementation. Floor planning was done to minimize the total interconnect lengths, especially for global signals. The precomputer is placed in a rectangular area at the top of the floor plan so that its outputs can be distributed to all the taps through the shortest possible paths. The power supply of the core is separated from the power supply of the pads, so that the power of the filter core can be measured separately from the power of the pads and the interface to the testing instrument.
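The difference between the two forms can be sketched behaviourally in Python. The code below is a minimal software model, not the VHDL implementation; the function names and the example coefficients are illustrative assumptions. In the transposed form every input sample is multiplied by all coefficients at once (the multiple constant multiplication pattern), and each adder is followed by a register, so no single large adder sits at the output.

```python
def fir_direct(x, h):
    """Direct form: each output is one long sum over delayed inputs."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y


def fir_transposed(x, h):
    """Transposed direct form: the current sample is multiplied by every
    coefficient, and partial sums travel through a chain of registers."""
    taps = len(h)
    regs = [0] * (taps - 1)          # registers between the adders
    y = []
    for sample in x:
        products = [hk * sample for hk in h]   # one variable, many constants
        y.append(products[0] + (regs[0] if regs else 0))
        # Update registers front to back so regs[i + 1] is still the old value.
        for i in range(taps - 2):
            regs[i] = products[i + 1] + regs[i + 1]
        if taps > 1:
            regs[taps - 2] = products[taps - 1]
    return y


h = [1, 2, 3]                 # illustrative coefficients
x = [1, 0, 0, 0, 5, -2]       # illustrative input samples
print(fir_direct(x, h))
print(fir_transposed(x, h))
```

Both functions produce the same output sequence; only the arrangement of adders and delays differs.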

3. General Background

The MCM-based structure for FIR filters for block size L = 4 is shown in Fig. 3 for the purpose of illustration. The MCM-based structure (shown in Fig. 3) involves six MCM blocks corresponding to six input samples. Each MCM block produces the necessary product terms. The sub expressions of the MCM blocks are shift-added in the adder network to produce the inner-product values (r_{l,m}), for 0 ≤ l ≤ L − 1 and 0 ≤ m ≤ (N/L) − 1. The inner-product values are finally added in the pipelined adder unit (PAU) to obtain a block of filter output.

Typical DSP algorithms involve a large number of multiplications, and multipliers consume a significant amount of area and computation time. It is therefore important to reduce the area and time complexity of multiplier implementations. In some applications, such as linear transformations and transposed form finite impulse response (FIR) filters, the same variable is multiplied by a set of constant coefficients, which are known a priori. Such structures are referred to as multiple constant multiplications (MCM) [2].
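As a small illustration of how a multiplication by a known constant can be carried out without a multiplier, the Python sketch below uses only shifts, additions and subtractions. The helper name `shift_add` and the example digit strings are assumptions made for illustration; digits are given most significant first and take values from {1, 0, -1}.

```python
def shift_add(x, sd_digits):
    """Multiply x by the constant encoded by signed digits (MSB first)
    using shifts, additions and subtractions only.
    Returns (product, number_of_add_or_subtract_operations)."""
    n = len(sd_digits)
    acc = 0
    ops = 0
    first = True
    for pos, d in enumerate(sd_digits):
        if d == 0:
            continue
        term = x << (n - 1 - pos)        # hardwired shift
        if first:
            acc = term if d > 0 else -term
            first = False
        else:
            acc = acc + term if d > 0 else acc - term
            ops += 1                     # one adder or subtractor
    return acc, ops


# 7 = 0111 in binary needs two adders; 7 = 100(-1) needs one subtractor.
print(shift_add(5, [0, 1, 1, 1]))
print(shift_add(5, [1, 0, 0, -1]))
```

The operation count equals the number of non-zero digits minus one, which is why representations with few non-zero digits are attractive.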

Efficient implementation of MCM is important for high-speed, low-complexity and low-power DSP systems. Typically, the multiplications in an MCM block are realized by a network of adders (subtractors) and hardwired shifts, with partial products shared across all the multiplications. The MCM problem has been studied extensively, and many different algorithms have been proposed by researchers to optimize the area consumption and computation time of the MCM block. All these approaches can be put into two broad categories: i) the common sub expression elimination (CSE) technique and ii) the graph-dependence (GD) algorithms [2].

Figure 2: Computation sharing multiplier (CSHM) architecture

Figure 3: MCM-based structure for fixed FIR filter of block size L = 4 and filter length N = 16.

Volume 6 Issue 4, April 2017

www.ijisr.net

Licensed Under Creative Commons Attribution CC BY

The CSE technique primarily searches for the most frequently occurring common sub expressions, which can be maximally shared across the multipliers in the MCM block.
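A toy example makes the sharing concrete. The constants 5 and 21 below are chosen purely for illustration and do not come from the filter designed in this paper. In binary, 5 = 101 and 21 = 10101, so the pattern 101 is a common sub expression: once 5x is available, 21x needs only one more adder.

```python
def mcm_separate(x):
    """Compute 5x and 21x independently with shifts and adds."""
    p5 = (x << 2) + x                  # 1 adder
    p21 = (x << 4) + (x << 2) + x      # 2 adders
    return (p5, p21), 3                # total adders used


def mcm_shared(x):
    """Compute 5x once and reuse it, since 21 = (5 << 2) + 1."""
    p5 = (x << 2) + x                  # 1 adder, shared sub expression 101
    p21 = (p5 << 2) + x                # 1 adder
    return (p5, p21), 2                # total adders used


print(mcm_separate(3))
print(mcm_shared(3))
```

Both versions give identical products, but the shared version uses two adders instead of three.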

Potkonjak et al. and Hartley pioneered the exploration of redundancy in MCM blocks using the CSE technique. Potkonjak et al. represented coefficients in signed digit (SD) form and used a recursive bipartite matching algorithm to identify the maximally shared common sub expressions [2].

Hartley expressed the coefficients in canonical signed digit (CSD) form and arranged them in a two-dimensional array to search for identical bit patterns in horizontal (intra-coefficient), vertical and oblique (inter-coefficient) directions. Thereafter, more CSE algorithms with different common sub expression identification strategies were proposed to reduce the number of logical operators (LOs) and the logic depth (LD). However, the disadvantage of CSE algorithms is that their performance depends on the number representation [2].

GD algorithms, on the other hand, make no assumptions on the number representation, such that they offer more degrees of freedom to the optimization of the MCM problem. The idea of representing multiplier blocks with directed acyclic graphs (DAGs) was introduced. [2]

4. Proposed Method

The novel high-speed filter architecture is realized using the minimal signed digit (MSD) representation.

The canonical signed digit (CSD) representation of a given number is unique, and the filters are implemented using hardware-efficient CSD multipliers. The hardware complexity is further reduced by searching for common sub expressions among multiple constants in applications requiring multiple constant multiplications (MCMs).

Figure 5 shows the representation of numbers using the CSD and MSD methods. After the numbers are represented in the appropriate form, the MCM algorithm is applied. The MSD representation minimizes the number of adders required.

Many approaches select the common sub expressions after representing the constants in the CSD representation. The minimal signed digit (MSD) representation has the same number of non-zero digits as the CSD representation but provides multiple representations for a constant. The MSD representation is therefore suitable for common sub expression elimination, and it significantly reduces the number of adders required for the filter synthesis.

The CSD representation is a radix-2 signed digit system with the digit set \{1, 0, \underline{1}\}, where \underline{1} denotes -1. The CSD representation for a given number is unique and has two properties: first, the number of non-zero digits is minimal, and second, the product of any two adjacent digits is zero. The CSD number system is an efficient way of representing the coefficients, as it has on average about 33% fewer non-zero digits than the binary representation. The CSD representation is widely used in implementing MCMs because it guarantees the least number of additions for a given constant multiplication. However, this results in limited sub expressions for multiple constants. If the second property is relaxed in the CSD representation, it is called the minimal signed digit (MSD) representation.
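The two CSD properties can be checked with a short conversion routine. The sketch below uses the standard non-adjacent form recoding, which is one common way to obtain CSD digits; the function names `to_csd` and `from_sd` are hypothetical, and -1 stands for the digit \underline{1}.

```python
def to_csd(n, width=0):
    """Return the CSD digits of integer n, most significant digit first,
    zero-padded on the left to at least `width` digits."""
    digits = []                      # collected least significant first
    while n != 0:
        if n % 2 == 0:
            d = 0
        else:
            d = 2 - (n % 4)          # +1 if n = 1 (mod 4), -1 if n = 3 (mod 4)
        digits.append(d)
        n = (n - d) // 2
    digits += [0] * max(0, width - len(digits))
    return digits[::-1]


def from_sd(digits):
    """Evaluate a signed digit string given most significant digit first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


csd = to_csd(7)
print(csd, from_sd(csd))             # 7 = 8 - 1
```

Choosing the digit from n modulo 4 forces the next digit to be zero, which is exactly the property that no two adjacent digits are both non-zero.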

The MSD representation is more appropriate for finding common sub expressions among multiple constants, provided a proper MSD form is selected for each constant to be synthesized. The only transformations needed to convert the CSD representation to MSD representations are 10 \underline{1} \rightarrow 011 and \underline{1} 01 \rightarrow 0 \underline{1}\underline{1}; both preserve the value of the pattern (3 and -3, respectively) and the number of non-zero digits. The CSD representation is registered as the first MSD representation. A pattern of either 10 \underline{1} or \underline{1} 01 is then searched for, starting from the most significant digit, and transformed into 011 or 0 \underline{1}\underline{1}, respectively. A new MSD representation is generated for each transformation. The transformation is applied repeatedly to the new MSD representations found in the previous transformations until no such pattern remains. In each new MSD representation, the search resumes from the digit position following the one where the transformation that generated it was applied.
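The generation of MSD forms from a CSD string can be sketched as follows. This is a simplified illustration: instead of resuming the search from a remembered digit position, it tries every position and uses a set to discard repeats. The function names and the example constant 11 are assumptions made for illustration, and -1 stands for the digit \underline{1}.

```python
# Value-preserving rewrites: 3 = 4 - 1 = 2 + 1 and -3 = -4 + 1 = -2 - 1
PATTERNS = {(1, 0, -1): (0, 1, 1), (-1, 0, 1): (0, -1, -1)}


def msd_forms(csd):
    """Collect the representations reachable from a CSD digit string
    (MSB first) by repeatedly applying the two pattern rewrites."""
    start = tuple(csd)
    found = {start}                  # the CSD form is the first MSD form
    pending = [start]
    while pending:
        rep = pending.pop()
        for i in range(len(rep) - 2):
            new = PATTERNS.get(rep[i:i + 3])
            if new is None:
                continue
            cand = rep[:i] + new + rep[i + 3:]
            if cand not in found:
                found.add(cand)
                pending.append(cand)
    return found


def sd_value(digits):
    """Evaluate a signed digit string given most significant digit first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


# CSD of 11 is 1 0 -1 0 -1 (16 - 4 - 1)
for form in sorted(msd_forms([1, 0, -1, 0, -1]), reverse=True):
    print(form, sd_value(form))
```

Every form found has the same value and the same number of non-zero digits as the CSD string, so each is a candidate when looking for patterns shared with other constants.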

5. Results and Comparison

In this section, results of the exact algorithm on randomly generated and real-sized FIR filter instances under CSD and MSD representations are presented and compared.

The proposed FIR filter using CSD and MSD concept is simulated using ModelSim software and the output parameters are analyzed using Xilinx software.

The area, power and time delay of the proposed FIR filter are analyzed using Xilinx software. Table 1 shows a comparison between the proposed method and the existing method. The possibility of realizing the FIR filter in transpose form configuration to achieve efficient area and delay for large order FIR filters was explored. The proposed method also focuses on minimizing area by replacing multiplications by constants with addition, subtraction, and shifting operations. Since shifts are free in terms of hardware, the MCM problem can be defined as the minimization of the number of addition/subtraction operations needed to implement the constant multiplications.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Existing method</th>
<th>Proposed method</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>CSD Based FIR</td>
<td>Combined CSD and MSD Based FIR</td>
</tr>
<tr>
<td>Slices</td>
<td>765</td>
<td>496</td>
</tr>
<tr>
<td>LUTs</td>
<td>1066</td>
<td>699</td>
</tr>
<tr>
<td>Delay (ns)</td>
<td>14.393</td>
<td>4.672</td>
</tr>
<tr>
<td>Power (mW)</td>
<td>186</td>
<td>181</td>
</tr>
</tbody>
</table>

The synthesis report is shown in Figure 8. The number of slices required for the CSD based method was 765. It is reduced to 496 when the CSD based FIR filter is replaced by the combined CSD and MSD based FIR filter. Correspondingly, the area of the filter is also reduced. The number of look-up tables required for the CSD based FIR filter was 1066. It is reduced to 699 when the combined CSD and MSD based FIR filter is used. The time delay report is shown in Figure 9. The time delay of the existing method was 14.393 ns. Using the proposed method, this time delay is reduced to 4.672 ns. The area delay product of the proposed system is 42% less than that of the existing system. The energy per sample of the proposed scheme is 40% less than that of the existing FIR filter structures.

Application-specific integrated circuit synthesis results show that the proposed structure for block size 4 and filter length 64 involves 13% less ADP and 12.8% less EPS than the existing direct-form block FIR structure.
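The relative savings in Table 1 can be recomputed directly from the tabulated values. The sketch below is only a worked calculation under stated assumptions: it takes the slice count as the area measure for an area delay product, and it takes power multiplied by delay as a stand-in for energy per sample. Neither definition is confirmed by the text, so the resulting percentages depend on these choices and need not coincide with the figures quoted above.

```python
# Values copied from Table 1
existing = {"slices": 765, "luts": 1066, "delay_ns": 14.393, "power_mw": 186}
proposed = {"slices": 496, "luts": 699, "delay_ns": 4.672, "power_mw": 181}


def reduction(old, new):
    """Fractional reduction from old to new (0.25 means 25% less)."""
    return 1.0 - new / old


def adp(design, area_key="slices"):
    """Area delay product, assuming the chosen resource count as area."""
    return design[area_key] * design["delay_ns"]


def energy_pj(design):
    """Power times delay in picojoules (mW x ns), assuming one sample
    is produced per critical-path delay."""
    return design["power_mw"] * design["delay_ns"]


for key in ("slices", "luts", "delay_ns", "power_mw"):
    print(key, round(100 * reduction(existing[key], proposed[key]), 1), "% less")
print("ADP (slices)", round(100 * reduction(adp(existing), adp(proposed)), 1), "% less")
print("power x delay", round(100 * reduction(energy_pj(existing), energy_pj(proposed)), 1), "% less")
```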

Figure 6: Graph showing time delay (ns) of two systems

Figure 7: Graph showing power consumption of two systems

Figure 8: Synthesis report of the proposed system from Xilinx software

Figure 9: Time delay report of the proposed system from Xilinx software

Figure 10: Power report of the proposed system from Xilinx software

The power report is shown in Figure 10. The total power required for the proposed system is 181 mW. For the existing system using the CSD based FIR filter, the required power was 186 mW, so the power is reduced by 5 mW.

6. Conclusion

The impact of the proposed design on power consumption, delay and area has been analyzed. With MCM, a smaller truncation error is obtained together with error compensation, and a considerable time saving is achieved for the compensated circuit. A high-accuracy, low-cost and flexible fixed-width multiplier is obtained by using MCM. The possibility of realizing the FIR filter in transpose form configuration to achieve efficient area and delay for large order FIR filters was explored. Application-specific integrated circuit synthesis results show that the proposed structure for block size 4 and filter length 64 involves 13% less ADP and 12.8% less EPS than the existing direct-form block FIR structure.

References


Author Profile

Raseena K. A received the B.Tech degree in Electronics and Communication Engineering from Calicut University and is currently pursuing the M.Tech degree in Applied Electronics and Communication under Kerala Technological University. Her research interests include VLSI and embedded systems.

Vincy Mathew completed her M.Tech and B.Tech at Karunya University. She is currently working as an assistant professor at Malabar College of Engineering and Technology, Thrissur. Her research interests include VLSI.