ACEEE Int. J. on Communications, Vol. 03, No. 01, March 2012
Diametrical Mesh of Tree (D2D-MoT) Architecture: A
Novel Routing Solution for NoC
Prasun Ghosal*, Sankar Karmakar
Department of Information Technology
Bengal Engineering and Science University, Shibpur
Howrah 711103, WB, India
Email: prasun@ieee.org, nitpiku@gmail.com
Abstract—Network-on-chip (NoC) is a new approach to designing future Systems-on-Chip (SoCs), in which a vast number of IP cores are connected through an interconnection network. Communication between the nodes occurs by routing packets rather than over dedicated wires. It supports a high degree of scalability, reusability, and parallelism in communication. In this paper, we present a mesh routing architecture, called Diametrical 2D Mesh of Tree, based on Mesh-of-Tree (MoT) routing and Diametrical 2D Mesh. It has the advantage of a small diameter as well as a large bisection width and small node degree, combined with being the fastest network in terms of speed. The routing algorithm ensures that packets always travel from source to sink along the shortest path, and it is deadlock free.
Index Terms—Network on Chip, NoC routing, Diametrical mesh of tree routing, D2D MoT
* Corresponding author
I. INTRODUCTION
A. Introduction to Network-on-chip
Present-day system-on-chip (SoC) designs contain billions of transistors. One of the major problems associated with future SoC design arises from non-scalable global wire delays. Global wires carry signals across a chip, but these wires do not scale in length with technology scaling. In ultra-deep-sub-micron processes, 80% or more of the delay of critical paths will be due to interconnects. Secondly, for a long bus line, the intrinsic parasitic resistance and capacitance can be quite high. If the bus length and/or the number of IP core blocks increases, the associated delay in bit transfer over the bus becomes arbitrarily large and exceeds the targeted clock period. Thirdly, the power consumption increases with the circuit size. Finally, in an SoC a bus allows only one communication at a time, so all buses of the hierarchy are blocked, as its bandwidth is shared by all the systems attached to it.
To overcome the limitations mentioned above, we consider the Network-on-Chip (NoC) architecture [3]. NoC is a new approach to designing future SoCs in which various IP cores are connected to a router-based network. The network is used for packet-switched on-chip communication among cores [2] [3] [4] [5] [6] [7].
B. Basic Components of NoC Routing
The basic components of NoC routing consist of three fundamental building blocks, viz. (i) switches, which are also called routers, (ii) Network Interfaces (NIs), which are also called network adapters, and (iii) links. The components are shown in Figure 1.
The backbone of the NoC consists of switches, whose main function is to route packets from source to destination. Some NoC designs depend on octagon or ring connectivity. This provides the logical control. An NoC can be based on circuit switching, packet switching, or a combination of both. An NI connects each core to the NoC. NIs convert transactions of requests/responses into packets and vice versa. Packets are split into FLow control unITS (FLITs) before transmission. The Look Up Table (LUT) specifies the path that a packet will follow inside the network to reach the destination [7].
Figure 1: Basic components in NoC routing
II. PROBLEMS IN NOC ROUTING
Design of an NoC consists of several problem areas [3]. These are as follows.
1. The topology synthesis problem.
2. The channel width problem.
3. The buffer sizing problem.
4. The floor-planning problem.
5. The routing problem.
6. The switching problem.
7. The scheduling problem.
8. The IP mapping problem.
Among these, the most important problem in NoC design is the routing problem. Network performance and power consumption are greatly affected by this phase.
A basic routing problem in NoC may be stated as follows:
Input: An application graph, a communication architecture A(R, ch), and the source and destination routers.
Find: A decision function at router r, RD(r, s, d, r(n)), for selecting an output port to route the current packet(s) while achieving a certain objective function.
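As a concrete illustration of such a decision function, the sketch below implements conventional dimension-ordered XY routing on a plain 2D mesh, a well-known deterministic scheme. It is not the algorithm proposed in this paper. The names `route_xy` and `trace_path`, the port labels, and the coordinate convention (x grows eastward, y grows northward) are assumptions made only for this example.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in the mesh (assumed convention)

# Displacement caused by leaving a router through each output port (assumed labels).
STEP: Dict[str, Coord] = {
    "EAST": (1, 0),
    "WEST": (-1, 0),
    "NORTH": (0, 1),
    "SOUTH": (0, -1),
}


def route_xy(curr: Coord, dest: Coord) -> str:
    """Decision function: pick the output port at router `curr` for a packet bound for `dest`."""
    cx, cy = curr
    dx, dy = dest
    if cx != dx:                      # first remove the offset in the x dimension
        return "EAST" if dx > cx else "WEST"
    if cy != dy:                      # then remove the offset in the y dimension
        return "NORTH" if dy > cy else "SOUTH"
    return "LOCAL"                    # packet has arrived; deliver to the attached core


def trace_path(src: Coord, dest: Coord) -> List[Coord]:
    """Apply the decision function hop by hop and return the routers visited."""
    path = [src]
    node = src
    while True:
        port = route_xy(node, dest)
        if port == "LOCAL":
            return path
        step_x, step_y = STEP[port]
        node = (node[0] + step_x, node[1] + step_y)
        path.append(node)


if __name__ == "__main__":
    print(trace_path((0, 0), (2, 1)))
```

Because the packet first exhausts its x offset and then its y offset, the number of hops equals the Manhattan distance between source and destination, which is the shortest path in a mesh without extra links.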
© 2012 ACEEE 30
DOI: 01.IJCOM.3.1.88
There may be different approaches to solving this problem. Two considerations matter most when designing the routing architecture: the complexity of implementation and the performance requirement. Compared to adaptive routing, deterministic routing is useful mainly because it uses fewer resources and guarantees the arrival of packets. Adaptive routing algorithms, however, give better throughput.
III. EXISTING ARCHITECTURES IN NOC
Different topologies have been developed for NoC architecture [3] [7]. Some are as follows.
Figure 2: Different topologies
A. Cliché topology
This architecture consists of an M × N mesh of switches; each switch, except those at the edges, is connected to four neighboring switches and one IP block. It has M rows and N columns, Diameter: (M+N-2), Bisection Width: min(M, N), Number of Routers required: (M × N), Node Degree: 3 (corner), 4 (boundary), 5 (central). This is shown in Figure 2.
B. Torus Topology
In the torus architecture [14], the only difference from the mesh is that the switches at the edges are connected to the switches at the opposite edge through wrap-around channels. It also has M rows and N columns, Diameter: [M/2] + [N/2], Bisection Width: 2 × min(M, N), Number of Routers required: (M × N), Node Degree: 5. This is shown in Figure 2.
C. Folded Torus
The only difference between the folded torus and the torus is that the long end-around connections are avoided by folding the torus, which avoids the excessive delay due to long wires. A folded torus with M rows and N columns has Diameter: [M/2] + [N/2], Bisection Width: 2 × min(M, N), Number of Routers required: (M × N), Node Degree: 5. This is shown in Figure 2.
D. Binary Tree
In a binary tree (Figure 2), 4 IP cores are connected at each leaf-level node, but none at the other nodes. A binary-tree-based network with N IP cores has Diameter: $\log_2 N$, Bisection Width: 1, Number of Routers required: (N-1), Node Degree: 5 (leaf), 3 (stem), 2 (root).
E. Octagon
Each node in this network (Figure 2) is associated with an IP and a switch. For a system consisting of more than eight nodes, the network is extended to multidimensional space. For a network having N IP blocks, Diameter: 2[N/8], Bisection Width: 6 for N ≤ 8 or 6(1+[N/8]) for N > 8, Number of Routers required: 8 for N ≤ 8 or 8 and (1+[N/8])[N/8] for N > 8, Node Degree: 4 (member node), 7 (bridge node).
F. SPIN
Every node has four children, and the parent is replicated four times at any level of the tree. The functional IP blocks reside at the leaves and the switches reside at the vertices. For N IP blocks, the network has Diameter: $\log_2 N$, Bisection Width: N/2, Number of Routers needed: $N \log_2 (N/8)$, Node Degree: 8 (non-root), 4 (root). This is shown in Figure 2.
G. Butterfly Fat Tree
In this network (Figure 2), IPs are placed at the leaves and switches are placed at the vertices. For N IPs, the network has Diameter: $\log_2 N$, Bisection Width: N, Number of Routers needed: N/2, Node Degree: 6 (non-root), 4 (root).
H. MoT (Mesh of Tree)
A 4 × 4 MoT network (shown in Figure 2) consists of 4 row trees and 4 column trees. The processor clusters (PCs) act as sources for packets and the memory modules (MMs) are at the roots of the trees. Each terminal of the MoT network could serve a processor cluster of up to 16 processors. For an N × N MoT, the network has Diameter: $4 \log_2 N$, Bisection Width: N, Number of Routers required: $(3N^2 - 2N)$, Node Degree: 2 (leaf), 3 (stem), 18 (root).
An M × N MoT [11] [12], where M and N denote the number of row and column trees, has
1. Number of nodes = 3 × (M × N) - (M+N)
2. Diameter = $2 \log_2 M + 2 \log_2 N$
3. Bisection width = min(M, N)
4. Recursive structure
5. A maximum of two layers, horizontal and vertical, are
sufficient for routing.
1) Addressing
The address of a router in an M × N MoT consists of four fields.
1. Row Number
2. Column Level
3. Column Number
4. Row Level
The details of the addressing scheme are omitted due to paucity of space.
2) Routing Algorithm
The routing algorithm given here follows the deterministic routing approach. It ensures that a packet always reaches its destination through the specified shortest path. We use the following abbreviations to describe the algorithm. Let RN: Row Number, CL: Column Level, CN: Column Number, RL: Row Level, addr(curr): address of the current node, addr(dest): address of the destination node. Each router executes the same algorithm, as given in Figure 3.
Figure 3: MoT routing algorithm
I. Diametrical 2D Mesh
Diametrical 2D Mesh is a performance-efficient topology [13], because its network diameter is reduced considerably in comparison to 2D Mesh. Although the area of this network is increased because of 8 extra links, its power consumption is decreased due to the reduction in average hop count. The number of extra links in Diametrical 2D Mesh does not grow with the number of IP cores: as the number of IP cores grows, the number of extra links remains constant at 8. The link redundancy therefore decreases as the number of IP cores grows. For example, the link redundancy relative to a 2D Mesh with 16 IP cores is 1/3 (8 extra links against 24 mesh links), and relative to a 2D Mesh with 25 IP cores this ratio is 1/5 (8 against 40). Furthermore, the network diameter is decreased by 50% for any number of IP cores in comparison to 2D Mesh. All in all, the main purpose of adding 8 extra links to the 2D Mesh topology is to reduce the diameter when the 2D Mesh is expanded to a large number of IP cores. Moreover, the area and power consumption redundancy is kept small by defining a constant 8 links that decrease the diameter and connect four edge sub-networks.
1) Diametrical 2D Mesh Addressing
Diametrical 2D Mesh uses addresses composed of two parts, X and Y, where X indicates the row and Y indicates the column in which the node is located (see Figure 4). The number of bits in the X and Y parts is determined by the number of rows and columns in the Diametrical 2D Mesh. For example, for 16 IP cores, the Diametrical 2D Mesh has 4 nodes in each row and 4 in each column, so both the X and Y parts use 2 bits. In other words, 2 bits give the row number, X, and 2 bits give the column number, Y. Likewise, for 25 IP cores, the Diametrical 2D Mesh has 5 nodes in each row and column. Therefore, the X and Y parts must each be three bits wide, which means that 6-bit addresses are used to label the Diametrical 2D Mesh with 25 IP cores. The addressing for both 16 and 25 IP cores is demonstrated in Figure 4.
Figure 4: Diametrical 2D mesh architecture
2) Routing in Diametrical 2D Mesh
The routing protocol deals with the resolution of the routing decision made at every router. The routing method affects the cost (area and power consumption) and the performance (average latency and throughput) of the NoC design. We propose extended XY routing for Diametrical 2D Mesh. Fundamentally, this routing algorithm is a shortest-path algorithm inherited from the well-known 2D Mesh XY routing. Similar to XY routing, Extended XY is very simple due to its simple addressing scheme and structural topology. Figure 3 shows the details of the routing pseudo code. As can be seen, we define $X_{offset}$ and $Y_{offset}$ values in the pseudo code, which are calculated as follows.
$$X_{offset} = X_{dest} - X_{current}$$
$$Y_{offset} = Y_{dest} - Y_{current}$$
where $X_{current}$ is the X value of the current node and $X_{dest}$ is the X value of the destination node. In addition, $Y_{current}$ is the Y value of the current node and $Y_{dest}$ is the Y value of the destination node. $X_{offset}$ and $Y_{offset}$ indicate the number of rows and columns, respectively, between the current node and the destination node. If both $X_{offset}$ and $Y_{offset}$ are zero, the current node is the destination and the packet has reached the destination node. Extended XY routing utilizes conventional XY routing in the following conditions.
Case I: When the diameter channel is not used:
1. If the current node and the destination node are located in the same row or column, or $X_{offset} + Y_{offset} < d-1$, conventional XY routing decisions are performed.
Case II: When the diameter channel is used:
If $X_{offset} + Y_{offset} > d-1$:
1. If the source switch has a diameter link, the flits are forwarded via the diameter channel.
2. If the source switch does not have a diameter channel, then the flits are first forwarded, based on XY routing, to the nearest intermediate node that has a diameter channel, and are then forwarded via that diameter channel.
IV. PROPOSED DIAMETRICAL 2D MESH OF TREE (D2D-MOT) ROUTING ARCHITECTURE
With the help of the Diametrical 2D Mesh topology and the MoT (Mesh of Tree) topology described earlier, a new architecture is formed, which we call the Diametrical 2D Mesh of Tree (D2D-MoT) routing architecture for NoC. Here 4 × 4 row and column trees are used to form the architecture. The leaf-level nodes are common to both sets of trees. In the figure, L, S, and R denote the leaf-, stem-, and root-level nodes. All these nodes are replaced by routers in practice. At the root level, the router is attached to IP cores. Each leaf node is connected diagonally within the same module. There is a stem node between two leaf nodes, connecting them. Likewise, between two stem nodes there is a root node, connecting them. There are eight root nodes, four external and four internal; the four internal root nodes are connected to their opposite counterparts. The architecture has ten extra links, which increase the wire length but reduce the extra hops, thereby increasing network performance due to high speed of operation. Also, the Diametrical 2D bypass channels cause around a 50 percent reduction in network diameter. Hence, the design of the proposed architecture provides a balanced improvement in both the performance and the cost of routing in an NoC.
The proposed architecture is shown in Figure 5.
Figure 5: Proposed diametrical 2D mesh of tree architecture
The characteristic parameter values of this network architecture (shown in Figure 5) are presented in Table I.
TABLE I
PERFORMANCE PARAMETER VALUES OF PROPOSED D2D-MOT ARCHITECTURE
V. PROPOSED ROUTING ALGORITHM
The routing algorithm follows the deterministic routing approach. It ensures that a packet always reaches its destination through the specified shortest path. Thus the proposed network is always livelock free. We use the following abbreviations to describe the algorithm. Let RN: Row Number, CL: Column Level, CN: Column Number, RL: Row Level, addr(curr): address of the current node, addr(dest): address of the destination node. Each router executes the same algorithm, as given in Figure 6.
Figure 6: Proposed D2D-MoT routing algorithm
VI. EXPERIMENTAL RESULT
The proposed algorithm is implemented in a standard desktop environment running the Linux operating system on a chipset with an Intel Pentium processor running at 3 GHz, using the GNU GCC compiler. The important section of a sample run is shown below.
Adjacency Matrix
**************************
Input the no of nodes in each side = 3
Total Number of nodes in n x n Matrix = 9
No IP Blocks is 18
********************************
Input adjacency matrix is:
********************************
1 1 0 0 0 1 1 1 1
0 0 1 1 1 0 1 0 1
0 1 1 1 0 0 0 1 1
1 1 0 0 0 1 1 1 0
1 1 1 1 0 0 0 0 0
0 1 0 1 1 0 0 0 1
1 1 0 1 1 1 0 0 0
0 1 1 0 1 1 0 1 1
1 1 1 0 0 1 0 1 1
No of 1 in the Matrix = 46
No of link is = 23
***XY CO-Ordinate is***
00 01 02 03 04 05 06 07 08
10 11 12 13 14 15 16 17 18
20 21 22 23 24 25 26 27 28
30 31 32 33 34 35 36 37 38
40 41 42 43 44 45 46 47 48
50 51 52 53 54 55 56 57 58
60 61 62 63 64 65 66 67 68
70 71 72 73 74 75 76 77 78
80 81 82 83 84 85 86 87 88
Input strating vertex = 5
Input destination = 7
Shortest path = 5 => 8 => 7
Minimum distance = 2
VII. CONCLUSION AND FUTURE WORK
In the D2D-MoT topology, ten (10) extra links are connected [(i-1) 2 in D2D-MoT to each diagonal router and four internal routers]. Because of this, the wire length increases slightly, but at the same time the diameter is reduced by about 50% compared to other topologies. This leads to lower average hop count, average latency, and power consumption. The performance depends on the average network latency and throughput. Hence we get a good overall performance improvement with this architecture.
The D2D-MoT is most advantageous for a large number of IP cores. For a small number of IP cores there is no significant improvement in its performance compared to other topologies. In future, we shall try to improve our architecture and algorithm to obtain better performance and to improve the energy and power consumption of the entire network for a large number of IPs. Another aim is to perform a more careful analysis and incorporation of the parasitic and leakage effects in the design of ultra-low-power NoCs.
ACKNOWLEDGEMENT
This research work is partially supported by the grants from DIT, Govt of WB under VLSI Design project.
REFERENCES
[1] Naveed Sherwani, Algorithms for VLSI Physical Design Automation, Kluwer Academic Publishers, 1999.
[2] P. Abad, V. Puente, J. A. Gregorio, and P. Prieto. Rotary Router: An Efficient Architecture for CMP Interconnection Networks. In Proc. of the International Symposium on Computer Architecture (ISCA), pages 116–125, San Diego, CA, June 2007.
[3] T. Bjerregaard and S. Mahadevan. A survey of research and practices of network-on-chip. ACM Comput. Surv., 38(1):1, 2006.
[4] R. D. Mullins, A. West, and S. W. Moore. Low-Latency Virtual-Channel Routers for On-Chip Networks. In Proc. of the International Symposium on Computer Architecture (ISCA), pages 188–197, Munich, Germany, 2004.
[5] D. Seo, A. Ali, W.-T. Lim, N. Rafique, and M. Thottethodi. Near-Optimal Worst-Case Throughput Routing for Two-Dimensional Mesh Networks. In Proc. of the International Symposium on Computer Architecture (ISCA), pages 432–443, Madison, WI, 2005.
[6] Balkan, A.O., Gang, Q. and Vishkin, U. (2006) ‘A mesh-of-trees interconnection network single-chip parallel processing’, IEEE 17th International
[7] Benini, L. and Micheli, G.D. (2002) ‘Network on chips: a new SOC paradigm’, IEEE Computer, Vol. 35, No. 1, pp.70–78.
[8] Chelcea, T. and Nowick, S.M. (2000) ‘A low latency FIFO for mixed clock systems’, Proceedings of IEEE Computer Society Workshop on VLSI, pp.119–126.
[9] Cummings, C.E. and Alfke, P. (2002) ‘Simulation and synthesis techniques for asynchronous FIFO design with asynchronous pointer comparisons’, Synopsys Users Group (SNUG) Conference,
[10] Dally, W.J. (1992) ‘Virtual-channel flow control’, IEEE Transactions on Parallel and Distributed Systems, Vol. 3, No. 2, pp.194–205.
[11] K. Manna, S. Chattopadhyay, and I. Sen Gupta, “Performance Evaluation of a Novel Dimension Order Routing Algorithm for Mesh-of-Tree based Network-on-Chip Architecture”, In Proceedings of the 1st International Conference on Parallel, Distributed and Grid Computing (PDGC-2010), pp. 135-139.
[12] S. Kundu, R. P. Dasari, S. Chattopadhyay, K. Manna, “Mesh-of-Tree Based Scalable Network-on-Chip Architecture”, In Proceedings of 2008 IEEE Region 10 Colloquium and the Third ICIIS, Kharagpur, INDIA.
[13] M. Reshadi, A. Khademzadeh, A. Reza, and M. Bahmani, “A Novel Mesh Architecture for On-Chip Networks”, D & R Industry Articles, http://www.design-reuse.com/articles/23347/on-chip-network.html
[14] Dally, W.J. and Seitz, C.L. (1986) ‘The torus routing chip’, Journal of Distributed Computing, Vol. 1, No. 4, pp.187–196.