# Rapid On-Chip Communication in 2D Networks Using 8-Port Router in A Multicast Environment and Their Realization

### Afroz Fatima<sup>1</sup>, Shaik Mohammed Waseem<sup>2</sup>

<sup>1</sup>M.Tech (Embedded Systems), Department of Electronics and Communication Engineering, GRIET, India

<sup>2</sup>M.Tech (VLSI), Department of Electronics and Communication Engineering, GRIET, India

**Abstract:** Effective on-chip communication is a crucial factor under multicast traffic. In conventional systems, two-dimensional (2D) networks with inter-layer communication based on 5-Port, 6-Port and 7-Port router mechanisms have been analyzed; they incur a large delay in delivering a packet from source to destination and a considerably higher hop latency count, thereby affecting the overall network performance. In this paper, we propose an 8-Port router architecture that improves the timing characteristics and reduces the average hop latency count in the network. The results evaluate the performance of the 8-Port router in terms of the total resources utilized, the timing characteristics and the logic circuit statistics.

Keywords: On-chip communication, multicast, 2D networks, inter-layer, router.

#### **1. Introduction**

On-chip communication is one of the key factors to be considered in networks, because slow data transfer affects the overall network performance. As reported in [3] and [4], existing techniques for on-chip data transfer that use 5x5, 6x6 or 7x7 routers on a two-dimensional network incur a large delay in sending a data packet from one point to another, increase the area used in the network, and increase the latency, with a significant impact on the throughput and efficiency of the network. To get the full benefit of parallel processing with tens to hundreds of processors, a multiprocessor system needs an efficient on-chip communication architecture [1]. In this paper, we therefore focus on reducing the delay incurred by conventional systems in a 2D network with the help of an 8x8 router, which improves the timing characteristics by performing fast data transfer between the devices, thereby reducing the hop latency count between the incoming nodes and making the 2D network efficient. This is a general concept proposed for complex on-chip communication, and it offers better scalability, higher throughput and reduced power consumption [5].

The paper is organized into the following sections. Section 2 summarizes the basic concepts of 2D networks and the 8-Port router. Section 3 realizes the 8-Port router on a 2D network. Section 4 discusses the results. Section 5 concludes the paper.

## 2. Background

#### 2.1. Two-Dimensional (2D) Networks

A NoC consists of multiple Processing Elements (PEs) connected together via routers (i.e. switches) through the network interface links. Figure 1 shows a simple 2D NoC mesh architecture with two layers, where each layer is divided into two network groups: a high channel (GH) sub-network and a low channel (GL) sub-network [2]. Depending on the routing scheme chosen, the data transfer takes place in either of the two network groups. In a typical NoC mesh architecture, any two nodes can communicate with one another. For each node in an $m \times n$ mesh, a label L(X, Y) is assigned as,

$$L(X, Y) = \begin{cases} Y \times N + X, & \text{if } Y \text{ is even} \\ Y \times N + N - X - 1, & \text{if } Y \text{ is odd} \end{cases} \tag{1}$$

where X and Y are the coordinates of the node and N is the number of nodes in a row of the mesh; if L(Y) > L(X), routing takes place in GH, otherwise in GL [2]. In this paper, we use a 4x4x2 mesh-based 2D NoC-Bus hybrid architecture comprising 32 nodes, with an 8x8 router for inter-layer communication. The 2D network was examined under a multicast traffic pattern [6].
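The labelling rule L(X, Y) above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes that N is the number of nodes per row (4 for one layer of the 4x4x2 mesh).

```python
def hamiltonian_label(x: int, y: int, n: int) -> int:
    """Label L(X, Y) of node (x, y), assuming n nodes per row."""
    if y % 2 == 0:
        return y * n + x          # even rows are numbered left to right
    return y * n + n - x - 1      # odd rows are numbered right to left


def labels_for_mesh(n: int, rows: int) -> list[list[int]]:
    """Labels of a whole layer, one inner list per row (row 0 first)."""
    return [[hamiltonian_label(x, y, n) for x in range(n)] for y in range(rows)]


# One 4x4 layer: every label from 0 to 15 appears exactly once.
layer = labels_for_mesh(4, 4)
```

With these assumptions the rows come out as 0..3, 7..4, 8..11 and 15..12, so consecutive labels always belong to neighboring nodes, which is what makes an ascending-label route possible.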



Figure 1: A two-dimensional mesh type NoC

#### 2.2. Eight-Port Router Methodology & Implementation

Routers are intelligent devices that receive incoming data packets, inspect their destination and determine the best path for the data to move from source to destination [3]. The router employed is an 8-port router, which connects up to eight links (E, W, N, S, NE, SE, SW and L, i.e. East, West, North, South, North East, South East, South West and Local) and is designed for planar interconnect to seven mesh neighbors and to one PE, as depicted in Figure 2(a). Figure 2(b) provides a detailed view of the 8-port router, which is made up of the buffer unit, the arbiter, the routing unit and the crossbar switch [4]. A separate buffer unit is employed for each channel, and each buffer has a port controller that senses the incoming signal and directs it to the arbiter. When an input arrives, it is first stored in the buffer unit, where the port controller senses the incoming signal through the Congestion Flag (CF) (i.e. ECF, WCF, NCF, SCF, NECF, SECF, SWCF and LCF) and issues a request to the routing unit through the arbiter to specify the path to be routed in the network [7]. The routing unit senses all the signals from the arbiter and chooses the desired route function (viz. XY, Hamiltonian or shortest-path routing) to indicate the route for a given packet in the 2D network. The arbiter then sends the selected route function to the crossbar switch (an 8:1 multiplexer), which selects one output from the eight inputs. This process repeats until the entire packet has been forwarded.

International Journal of Science and Research (IJSR), ISSN (Online): 2319-7064, Impact Factor (2012): 3.358

Volume 3 Issue 9, September 2014, <u>www.ijsr.net</u>, Licensed Under Creative Commons Attribution CC BY
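The buffer, congestion-flag and crossbar interplay described in Section 2.2 can be sketched as a small behavioural model. This is not the authors' VHDL design: the class `Router8`, the `route_fn` callback and the fixed scan order over the input ports are assumptions made purely for illustration.

```python
from collections import deque

PORTS = ("E", "W", "N", "S", "NE", "SE", "SW", "L")  # port names from the text


class Router8:
    """Toy model: one FIFO per input port, one congestion flag per output."""

    def __init__(self, route_fn):
        self.route_fn = route_fn                     # maps a flit to an output port
        self.buffers = {p: deque() for p in PORTS}   # input buffer units
        self.congested = {p: False for p in PORTS}   # stands in for ECF ... LCF

    def receive(self, port, flit):
        """Store an arriving flit in the buffer of its input port."""
        self.buffers[port].append(flit)

    def step(self):
        """Forward at most one flit per output port; return {output: flit}."""
        forwarded = {}
        for in_port in PORTS:                        # fixed scan order (assumption)
            if not self.buffers[in_port]:
                continue
            out_port = self.route_fn(self.buffers[in_port][0])
            if self.congested[out_port] or out_port in forwarded:
                continue                             # flit waits in its buffer
            forwarded[out_port] = self.buffers[in_port].popleft()
        return forwarded
```

A flit whose output port is flagged as congested simply stays at the head of its FIFO and is retried on the next call to `step()`, which mirrors the "wait in the buffer" behaviour in the text.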







# **3. Realization of 8-Port Router in a 2D Network**

Consider a two-dimensional mesh network (4x4x2) with 32 nodes, labeled from 0 to 15 in each layer and differentiated by their layer IDs, as depicted in Figure 3. Proper flow control methods should be chosen to obtain an efficient design [8]. For an effective implementation of the router on a 2D network, we consider a packet format of 128 bits (16 nodes of 4 bits each), as shown in Figure 4. The packet is divided into 3 groups: Data, Source Address and Destination Address. The Data field can be up to 64 bits in size, and the Source and Destination Addresses are maintained at 32 bits each.
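As a rough illustration of the 128-bit packet, the following Python sketch packs and unpacks the three fields. The field order (Data in the upper bits, then Source, then Destination) and the function names are assumptions; the actual layout is the one defined in Figure 4.

```python
DATA_BITS, SRC_BITS, DST_BITS = 64, 32, 32   # field widths stated in the text


def pack_packet(data: int, src: int, dst: int) -> int:
    """Pack the fields into one 128-bit integer (assumed order: data|src|dst)."""
    for value, bits in ((data, DATA_BITS), (src, SRC_BITS), (dst, DST_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field does not fit in its width")
    return (data << (SRC_BITS + DST_BITS)) | (src << DST_BITS) | dst


def unpack_packet(packet: int) -> tuple[int, int, int]:
    """Recover (data, src, dst) from a packed 128-bit integer."""
    dst = packet & ((1 << DST_BITS) - 1)
    src = (packet >> DST_BITS) & ((1 << SRC_BITS) - 1)
    data = packet >> (SRC_BITS + DST_BITS)
    return data, src, dst
```

For example, `unpack_packet(pack_packet(0xCAFE, 7, 15))` returns the original triple, and any packed value fits in 128 bits.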



Figure 3: A 2D network in a multicast environment



**Figure 5:** Steps in the router for the 2D network: (a) first-try successful, (b) second-try successful

To observe on-chip communication in the 2D network, we consider the data transmission between nodes 7 and 15 (Figure 3). In layer-0, the data transfer starts at source node 7, and at the respective router node the data is stored in the buffer; the port controller at each buffer unit performs read and write operations by sensing whether the signal state is free or congested. If it is congested, the current node does not acknowledge its neighboring node, and the data remains in the buffer and waits for the current data to be routed, without causing an interruption (Figure 5(a) and (b)). If it is free, the routing unit is invoked to specify a path to route the data (in this case an ascending order was considered) and the data is moved to the next neighboring node specified by the routing unit, i.e. to node 8 (south east direction at layer-0 and layer-1, selected via the crossbar switch). This process repeats until all the nodes are occupied and the data reaches the destination node 15 (north east direction at layer-0 and layer-1). The paths covered are $\{7_0, 8_0, 9_0, 10_0, 11_0, 12_0, 13_0, 14_0, 15_0[15_1]\}$ and $\{7_0[7_1], 8_1, 9_1, 10_1, 11_1, 12_1, 13_1, 14_1, 15_1\}$, where the subscript denotes the layer, which means a maximum latency of 9 hops in both cases.
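The 9-hop figure for the transfer from node 7 to node 15 can be reproduced with a short Python sketch. It assumes that every link traversal counts as one hop, including the single inter-layer transfer, and that the route visits the labels in ascending order; the helper names are hypothetical.

```python
Node = tuple[int, int]  # (label within the layer, layer ID)


def ascending_paths(src: int, dst: int, src_layer: int = 0, dst_layer: int = 1):
    """Return the two ascending-label routes: change layer last, or change layer first."""
    labels = list(range(src, dst + 1))
    cross_last = [(n, src_layer) for n in labels] + [(dst, dst_layer)]
    cross_first = [(src, src_layer)] + [(n, dst_layer) for n in labels]
    return cross_last, cross_first


def hop_count(path: list[Node]) -> int:
    """Number of link traversals along a path."""
    return len(path) - 1


cross_last, cross_first = ascending_paths(7, 15)
```

Both routes contain eight in-layer hops plus one inter-layer hop, so `hop_count` gives 9 for each of them under these assumptions.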

# 4. Results and Discussion

To demonstrate the efficiency of the 8-Port router for the 2D hybrid mesh architecture, synthesis and simulation were performed in VHDL on the device xc3s500e-4fg320. The proposed architecture was analyzed under a multicast traffic pattern, and the arbitration scheme chosen was the round-robin algorithm. The results listed in Tables 1, 2 and 3 show the resource utilization, the timing measures and the logic circuit statistics for 32 nodes and 32 routers on a 2D network; the buffer size of each FIFO was 8 flits and the number of payloads considered was 6. The data flow started at 15 ns and reached the final destination at 835 ns, giving a delay of 820 ns between nodes 7 and 15 in the 2D network.
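Round-robin arbitration, the scheme named above, can be illustrated with a minimal Python sketch. The class below is a generic textbook-style arbiter with hypothetical names, not a transcription of the synthesized VHDL: after a port is granted, the search for the next grant starts at the port following it.

```python
class RoundRobinArbiter:
    """Grant one requesting port per call, rotating priority after each grant."""

    def __init__(self, n_ports: int = 8):
        self.n_ports = n_ports
        self.last = n_ports - 1          # so that port 0 is examined first

    def grant(self, requests: list[bool]):
        """Return the index of the granted port, or None if nobody requests."""
        for offset in range(1, self.n_ports + 1):
            candidate = (self.last + offset) % self.n_ports
            if requests[candidate]:
                self.last = candidate    # the granted port gets lowest priority next
                return candidate
        return None
```

With four ports and standing requests on ports 0, 2 and 3, successive calls grant 0, 2, 3 and then 0 again, so no requesting port is starved.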

**Table 1:** Resource Utilization Details of 8-Port Router

| Resource         | Used  | Available | Utilization (%) |
|------------------|-------|-----------|-----------------|
| Slices           | 31056 | 4656      | 667             |
| Slice Flip Flops | 22163 | 9312      | 238             |
| 4 Input LUTs     | 59690 | 9312      | 641             |
| Bonded IOBs      | 3902  | 232       | 1682            |
| GCLKs            | 24    | 24        | 100             |
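The utilization column of Table 1 is the ratio of used to available resources, expressed as a percentage. The short Python check below recomputes it from the table values; the rounding to the nearest integer is an assumption about how the percentages were reported.

```python
# (used, available) pairs copied from Table 1
RESOURCES = {
    "Slices": (31056, 4656),
    "Slice Flip Flops": (22163, 9312),
    "4 Input LUTs": (59690, 9312),
    "Bonded IOBs": (3902, 232),
    "GCLKs": (24, 24),
}


def utilization_percent(used: int, available: int) -> int:
    """Utilization as a percentage, rounded to the nearest integer."""
    return round(100 * used / available)


percentages = {name: utilization_percent(u, a) for name, (u, a) in RESOURCES.items()}
```

A value above 100 means that the 32-node, 32-router design requests more of that resource than the target device provides.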

Table 2: Timing Summary of 8-Port Router

| Parameter                             | Value      |
|---------------------------------------|------------|
| Max. Frequency                        | 49.718 MHz |
| Min. Period                           | 20.871 ns  |
| Min. Input Arrival Time Before Clock  | 7.386 ns   |
| Max. Output Required Time After Clock | 16.748 ns  |

Table 3: Logic Circuit Statistical Information of 8-Port Router

| Cell Name | Library Name | Number of Gates |
|-----------|--------------|-----------------|
| IOs       | xcv2p           | 4210            |
| BUF       | xcv2p           | 18              |
| GND       | xcv2p           | 1               |
| INV       | xcv2p           | 841             |
| LUT2      | xcv2p           | 11321           |
| LUT2_D    | xcv2p           | 26              |
| LUT2_L    | xcv2p           | 33              |
| LUT3      | xcv2p           | 7100            |
| LUT3_D    | xcv2p           | 184             |
| LUT3_L    | xcv2p           | 538             |
| LUT4      | xcv2p           | 18152           |
| LUT4_D    | xcv2p           | 752             |
| LUT4_L    | xcv2p           | 708             |
| MUXF5     | xcv2p           | 1805            |
| VCC       | xcv2p           | 1               |
| FD_1      | xcv2p           | 350             |
| FDC       | xcv2p           | 7514            |
| FDCE      | xcv2p           | 2910            |
| FDE       | xcv2p           | 6840            |
| LD        | xcv2p           | 6750            |
| BUFG      | xcv2p           | 23              |
| BUFGP     | xcv2p           | 1               |
| IBUF      | xcv2p           | 1961            |
| OBUF      | xcv2p           | 2041            |
| RAMs      | xcv2p           | 7381            |

# 5. Conclusion

From the synthesis results shown, we conclude that the eight-port router architecture discussed is an effective model that helps reduce the delay in sending packets among multiple PEs, reduces the latency count and provides a significant increase in throughput over conventional router mechanisms in a two-dimensional network. The reason for this improvement is the larger number of channels available for the PEs to choose from and the efficiency of the switching element in selecting an immediate neighbor signaled by the route function. As a result, the average latency count is reduced, thereby improving the overall system performance. The current work could be extended to domains where accelerated real-time data communication is needed, such as multimedia applications.

# References

- [1] L. Benini and G. De Micheli, "Networks on chips: a new SoC paradigm", Proc. IEEE Computer Society, DATE Conference and Exhibition, 35(1): pp. 70–78, 2002.
- [2] Sanaz Rahimi Moosavi, Amir-Mohammad Rahmani, Pasi Liljeberg, Juha Plosila, and Hannu Tenhunen, "Enhancing Performance of 3D Interconnection Networks Using Efficient Multicast Communication Protocol", Proc. 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, pp. 294-301, 2013.

- [3] Swapna S, Ayas Kanta Swain and Kamala Kanta Mahapatra, "Design and Analysis of Five Port Router for Network on Chip", Proc. IEEE Prime Asia, pp. 51-55, 2012.
- [4] Yuan Xie, Jason Cong, Sachin Sapatnekar, "Three-Dimensional Integrated Circuit Design", EDA, Design and Microarchitectures, Springer, 2010.
- [5] A. Jantsch and H. Tenhunen, "Networks on Chip", Kluwer Academic Publishers, 2003.
- [6] P. Abad, V. Puente, and J. A. Gregorio, "Enabling Fully Adaptive Multicast Routing for CMP Interconnection Networks", Proc. IEEE 15th International Symposium on High Performance Computer Architecture, pp. 355–366, 2009.
- [7] Sudeep Pasricha and Nikil Dutt, "On-Chip Communication Architectures, System on Chip Interconnect", Morgan Kaufmann Publishers, Elsevier, 2008.
- [8] William James Dally and Brian Towles, "Principles and Practices of Interconnection Networks", Morgan Kaufmann Publishers, Elsevier, 2004.

# **Author Profile**

**Afroz Fatima** received the B.E (EIE) degree from Osmania University, Hyderabad. She is pursuing M. Tech (Embedded Systems) from GRIET (JNTUH). Her research interests include On-Chip & Off-Chip Communication, Real-Time Embedded Systems and Industrial Control Systems.



**Shaik Mohammed Waseem** received the B.E (ECE) degree from Osmania University, Hyderabad. He is pursuing M.Tech (VLSI) from GRIET (JNTUH). His research interests include Image Processing, On-Chip & Off-Chip Communication, Design for Fault Tolerant Systems and Digital System Design.