## **LETTER RONoC: A Reconfigurable Architecture for Application-Specific Optical Network-on-Chip**

## Huaxi GU<sup>†a)</sup>, Member, Zheng CHEN<sup>†</sup>, Yintang YANG<sup>††</sup>, and Hui DING<sup>†</sup>, Nonmembers

**SUMMARY** Optical Network-on-Chip (ONoC) is a promising emerging technology that can overcome the bottlenecks faced by electrical on-chip interconnection. However, most existing ONoC proposals are built on fixed topologies, which are not flexible enough to support various applications. To make full use of limited resources and allocate them more efficiently, RONoC (Reconfigurable Optical Network-on-Chip) is proposed in this letter. Its topology can be reconfigured to meet the requirements of different applications. An 8×8 nonblocking router is also designed, together with the communication mechanism. The simulation results show that the saturation load of RONoC is twice that of mesh, and that its energy consumption is 25% lower than that of mesh.

key words: reconfiguration, optical Network-on-Chip, application-specific, optical router

### 1. Introduction

Network-on-Chip (NoC) is a promising way to integrate a multi-core system on a single chip [1], [2]. Compared with conventional electronic interconnect, optical interconnect can provide more bandwidth at a lower power budget. Photonic technology is therefore becoming an increasingly attractive solution to the challenges faced by on-chip communication. Meanwhile, recent developments in silicon photonics have made it feasible to fabricate all the necessary optical building blocks on a single chip [3], achieving high-bandwidth and energy-efficient intra-chip communication. Various architectures have been proposed, most of which are built on a fixed topology [4]–[7]. However, an NoC architecture may have to serve various applications with different traffic characteristics. The ONoCs proposed in the literature therefore often lack the flexibility to meet the different requirements of these applications.

On the other hand, since optical buffers are not yet mature in today's photonic technology, optical packet switching is difficult to implement for on-chip networks. Hence, a circuit-switching-like transmission mechanism is often employed for ONoC. However, limited path diversity is one of its main disadvantages, especially for mesh-based ONoC. The high contention probability leads to high latency and degraded throughput. One solution is to employ reconfiguration to meet the different requirements of various applications [8], [9]. It can also help to increase the success rate of path establishment. In the electrical domain, several reconfiguration methods are available. Stensgaard et al. [10] presented a NoC architecture called ReNoC that allows the network topology to be statically reconfigured by using an intelligent switch. Binzhang Fu, Yinhe Han, et al. proposed a configurable wormhole routing algorithm [11]. In the optical domain, related research is relatively scarce. Artundo et al. proposed a reconfigurable optical interconnect whose optical-layer topology adapts automatically to the evolving traffic situation, allowing a large fraction of the short coherence messages to use the optical links [12]. Gao et al. presented the RePNoC architecture, in which the topology can be preset by configuring the connections according to different application patterns [13]. Connections between source and destination IP cores over long links can be set up by bypassing the intermediate routers.

In this letter, a reconfigurable architecture for application-specific optical Network-on-Chip is proposed. Its interconnection can be reconfigured for different applications, achieving lower latency and power consumption. The RONoC topology has a simple architecture, and its connections are easy to configure just by altering the states of different groups of microring resonators (MRs). A configuration algorithm for different traffic patterns is also developed, which applies to diverse applications. The simulation results show that RONoC provides low-latency and energy-efficient performance.

### 2. The Architecture of Reconfigurable Optical Network-on-Chip

#### 2.1 Network Architecture

The mesh topology has the advantages of a regular shape and a simple structure, so it is widely used in NoC. However, it cannot satisfy specific applications well, especially when the traffic among different communication pairs fluctuates dramatically. It is therefore necessary to reconfigure the topology to meet the requirements of different applications. Consequently, RONoC, a variant of the traditional mesh topology, is employed for the ONoC architecture; it inherits the advantages of mesh and achieves better flexibility. Figure 1 illustrates the architecture of RONoC. There are two types of nodes in RONoC, i.e. routers and IP cores. The nodes (IP cores or routers) are placed in rows and columns as in the mesh topology. Each router has eight ports, four of which link it to the four neighboring routers. The other four ports connect the four IP cores around the router. If more IP cores were connected to the router, the number of ports would increase, leading to higher hardware cost and more power consumption. Each IP core is connected to the router by waveguides and three pairs of MRs. In each pair of MRs, one is used for receiving and the other for transmitting. The optical/electronic (O/E) and electronic/optical (E/O) interfaces between the router and the IP core convert optical signals into electronic signals and vice versa.

**Fig. 1** Architecture of $4 \times 4$ RONoC.

Manuscript received July 1, 2013.

<sup>†</sup>The authors are with State Key Laboratory of Integrated Service Networks, Xidian University, Xi'an, 710071, China.

<sup>††</sup>The author is with School of Microelectronic, Xidian University, Xi'an, 710071, China.

a) E-mail: hxgu@xidian.edu.cn

DOI: 10.1587/transinf.E97.D.142

Compared with popular ONoC topologies such as mesh, the proposed RONoC has several advantages. Its diameter and average distance are much smaller than those of mesh, which helps to decrease latency and energy consumption. The additional link options also give more flexibility when configuring the topology.

#### 2.2 Implementation of the Reconfiguration

The RONoC needs to be configured to decide which router each IP core connects to. When the application changes, the RONoC should be reconfigured to meet the new requirements. Configuration and reconfiguration are carried out by controlling the three pairs of MRs around the IP core. By default, these six MRs are all in the off state, so each IP core is connected to its south-west router. In this situation, the topology of RONoC is the same as the mesh topology. If an IP core needs to be connected to another router, the reconfiguration is carried out by powering on the related MRs. For example, if IP core (0, 0) needs to be connected to router (1, 1), resonators 3 and 4 should be powered on, while the remaining resonators 1, 2, 5 and 6 stay in the off state. The optical signal issued by IP core (0, 0) is then coupled by resonator 3 to router (1, 1), and the signal from the south-west side of router (1, 1) is coupled by resonator 4 to IP core (0, 0).

**Fig. 2** Layout of $8 \times 8$ nonblocking switching fabric.

Once every IP core has executed this configuration operation, the reconfiguration of the topology is complete. Since the configuration depends on the specific application, we propose an efficient configuration algorithm for different applications, which is introduced in detail in Sect. 3.
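The per-core configuration step can be sketched as a lookup from the chosen router to the MRs that must be powered on. This is only an illustration: the default case (all six MRs off, south-west router) and the use of resonators 3 and 4 for the north-east router follow the example above, while the assignment of resonator pairs (1, 2) and (5, 6) to the north-west and south-east routers, and all names in the code, are assumptions.

```python
# Hypothetical mapping from the target router (direction relative to the
# IP core) to the MR pair that must be powered on. Only "SW" (default, all
# MRs off) and "NE" (resonators 3 and 4) come from the text; the "NW" and
# "SE" entries are assumed for illustration.
MR_PAIR_FOR_ROUTER = {
    "SW": (),        # default: all six MRs off
    "NE": (3, 4),    # example in the text: IP core (0, 0) -> router (1, 1)
    "NW": (1, 2),    # assumed
    "SE": (5, 6),    # assumed
}


def mr_states(target_router: str) -> dict[int, bool]:
    """Return the on/off state of resonators 1..6 around one IP core."""
    powered = MR_PAIR_FOR_ROUTER[target_router]
    return {mr: (mr in powered) for mr in range(1, 7)}


# Reconfigure one IP core from the default router to its north-east router.
print(mr_states("SW"))
print(mr_states("NE"))
```

Keeping the state as one small table per IP core reflects the point made above: reconfiguration only toggles MRs and does not touch the routers themselves.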

#### 2.3 Router Architecture

The optical router is one of the key components of RONoC; it implements the function of routing packets. To route packets between the eight distinct ports without blocking, an $8 \times 8$ optical router architecture is designed. It consists of a switching fabric and a control unit. The switching fabric, shown in Fig. 2, is the core of the optical router. It switches optical signals from an input port to an output port without blocking. In our design, the switching fabric is built from 54 MRs and 14 waveguides. The control unit is built from traditional CMOS transistors. It uses electrical signals to configure the switching fabric by powering each MR on or off according to the routing information. The control units of all the routers in RONoC use the electronic network to set up the optical paths.

#### 2.4 Communication Mechanism

To reduce the blocking probability in the circuit-switched network, a hybrid communication mechanism is designed. IP cores connected to the same optical router can communicate directly if both the input and the output port are free, which lowers the latency penalty. When the output port is granted, the related resonator inside the router is powered on, and the optical signal from the source IP core passes through the router and arrives at the destination IP core directly. Otherwise, when the source and destination cores are not connected to the same router, a physical path must be reserved from the source to the destination before the optical data is transmitted. The path, consisting of several intermediate routers and the waveguides connecting them, is managed by three electrical control signals: SETUP, ACK and RELEASE. The SETUP signal is issued before the transmission of optical data. It progresses toward the destination using the XY routing algorithm and reserves the intermediate links. If the SETUP signal is blocked, it waits until the required resource is released by the corresponding RELEASE signal. When the SETUP signal reaches its destination, an ACK signal is sent back to the source, powering on the related resonators in the optical routers along the reserved path. Once it receives the ACK signal, the source sends the optical data, after which it sends a RELEASE signal to tear down the reserved path.
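The path reservation described above can be sketched in a few lines. The sketch assumes routers are addressed by (x, y) coordinates and simplifies the protocol: a SETUP either reserves every link of its XY path at once or fails, whereas the mechanism above lets a blocked SETUP wait in place. The class and function names are hypothetical.

```python
def xy_route(src, dst):
    """Routers visited under XY routing: move along X first, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


class ControlNetwork:
    """Tracks reserved router-to-router links (simplified, hypothetical model)."""

    def __init__(self):
        self.reserved = set()

    def setup(self, src, dst):
        """SETUP: reserve all links on the XY path, or return None if blocked."""
        path = xy_route(src, dst)
        links = list(zip(path, path[1:]))
        if any(link in self.reserved for link in links):
            return None  # blocked: the caller retries after a RELEASE
        self.reserved.update(links)
        return path  # an ACK would now travel back and power on the MRs

    def release(self, path):
        """RELEASE: tear down a reserved path after the optical data is sent."""
        self.reserved.difference_update(zip(path, path[1:]))


net = ControlNetwork()
first = net.setup((0, 0), (2, 1))    # reserves (0,0)->(1,0)->(2,0)->(2,1)
second = net.setup((1, 0), (2, 0))   # needs link (1,0)->(2,0): blocked
net.release(first)
third = net.setup((1, 0), (2, 0))    # succeeds once the path is released
print(first, second, third)
```

IP cores attached to the same router would bypass `setup` entirely, which is the direct-communication case of the hybrid mechanism.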

### 3. Configuration Algorithm for the Reconfigurable Optical Network-on-Chip

To make full use of RONoC, a configuration algorithm is proposed that aims at optimized performance for specific applications.

The first step of the proposed algorithm is to determine the number of routers in the NoC. The number of routers in RONoC should be large enough to interconnect the IP cores of a given application. However, the fewer the routers, the lower the overhead. Taking the chip shape into account, the difference between the number of routers in the x dimension and in the y dimension should be kept as small as possible.
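The sizing step can be sketched as a search for the smallest product x × y that is at least the number of IP cores and whose two factors are nearly equal. The exact dimension constraints are not spelled out here, so the sketch assumes a bound `max_gap` on the difference between the two dimensions; that parameter and the function name are hypothetical.

```python
import math


def choose_grid(n_cores: int, max_gap: int = 1) -> tuple[int, int]:
    """Return (x, y) with x * y >= n_cores, x <= y and y - x <= max_gap.

    The search starts at Q = n_cores and increases Q until a factorization
    Q = x * y meets the (assumed) shape constraint.
    """
    q = n_cores
    while True:
        # Walk down from sqrt(q): the first divisor found gives the most
        # square factor pair of q.
        for x in range(math.isqrt(q), 0, -1):
            if q % x == 0:
                y = q // x
                if y - x <= max_gap:
                    return x, y
                break  # any other factor pair of q is even less square
        q += 1


print(choose_grid(12))
print(choose_grid(7))
```

With the default `max_gap` of 1, 12 cores give a 3 × 4 grid and 7 cores are rounded up to a 3 × 3 grid.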

The next step is to configure the RONoC for the application, i.e. to map the IP cores onto the network and connect them to the appropriate routers. Power consumption is a major concern in NoC design. To save energy, IP cores that exchange heavy traffic should be placed as close together as possible. Before mapping, we divide the IP cores into clusters. Each cluster contains at most 4 IP cores, because each router in RONoC can connect at most 4 IP cores. The clusters are then placed onto the network so as to achieve the lowest energy consumption. The algorithm is listed in Fig. 3.
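The clustering phase can be sketched as follows. This is one possible reading of the description, not the authors' implementation: each core is paired with the core it sends the most traffic to, mutual pairs seed clusters, and the remaining cores join their partner's cluster while it has fewer than 4 members. A core whose partner's cluster is full or does not exist becomes its own cluster, which is an assumption. The traffic matrix is made-up example data.

```python
def cluster_cores(volume, max_size=4):
    """Greedy clustering of IP cores; volume[i][j] is the traffic from i to j."""
    n = len(volume)
    # Each core points at the partner it sends the most traffic to.
    best = {i: max((j for j in range(n) if j != i), key=lambda j: volume[i][j])
            for i in range(n)}
    clusters = []
    member_of = {}

    def new_cluster(cores):
        cluster = set(cores)
        clusters.append(cluster)
        for core in cluster:
            member_of[core] = cluster

    # Mutual best partners seed two-core clusters.
    for i, j in best.items():
        if i < j and best[j] == i:
            new_cluster((i, j))

    # Remaining pairs, heaviest traffic first.
    left = sorted(
        ((i, j) for i, j in best.items() if i not in member_of),
        key=lambda pair: volume[pair[0]][pair[1]],
        reverse=True,
    )

    # Join the partner's cluster if it has room, otherwise start a new one.
    for i, j in left:
        target = member_of.get(j)
        if target is not None and len(target) < max_size:
            target.add(i)
            member_of[i] = target
        else:
            new_cluster((i,))
    return clusters


# Made-up traffic matrix for five cores (row = source, column = destination).
volume = [
    [0, 9, 1, 0, 0],
    [8, 0, 2, 0, 0],
    [1, 5, 0, 1, 0],
    [0, 0, 2, 0, 7],
    [0, 0, 0, 6, 0],
]
print(cluster_cores(volume))
```

In this example cores 0 and 1 and cores 3 and 4 are mutual best partners, and core 2 then joins the cluster of core 1. The subsequent mapping of clusters onto routers is not covered by the sketch.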

### 4. Simulation and Results

We built a network simulator for RONoC based on OPNET and compared RONoC against a mesh-based optical NoC in terms of latency, energy consumption, and insertion loss. As VOPD is widely employed as a benchmark for application-specific NoC [14], [15], we use this application for the comparison. For the mesh network, a genetic algorithm (GA) and simulated annealing (SA) are used to map the VOPD application. Both networks employ the circuit-switching mechanism. The setup packet is 32 bits long and the data packet is 1024 bits long. The transmission bandwidth is assumed to be 12.5 Gbps, which can be achieved by current nanophotonic devices [16].
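As a quick check of these parameters, the time needed to serialize one data packet onto the optical link follows directly from the packet length and the bandwidth. The variable names below are illustrative, and the setup packet is left out because the bandwidth of the electrical control network is not stated here.

```python
DATA_PACKET_BITS = 1024   # data packet length from the text
BANDWIDTH_BPS = 12.5e9    # 12.5 Gbps optical bandwidth from the text

# Serialization time of one data packet, ignoring path setup and propagation.
data_time_ns = DATA_PACKET_BITS / BANDWIDTH_BPS * 1e9
print(f"{data_time_ns:.2f} ns per data packet")
```

A reserved path is therefore occupied for roughly 82 ns of payload transfer per packet, in addition to the time spent on SETUP and ACK.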

The comparison of end-to-end (ETE) delay is shown in Fig. 4. The saturation load of RONoC is double that of the mesh topology, regardless of whether the GA or the SA algorithm is used. The performance improvement of RONoC mainly comes from the shorter hop count. The direct connections of adjacent cores also contribute to the performance gain, because these cores can communicate directly without a path reservation phase.

**Fig. 3** Configuration Algorithm for RONoC.

- 1: initialize N ← the number of IP cores, Q ← N;
- 2: do { /\*specify the size of RONoC\*/
- 3: get x and y satisfying Q = x × y;
- 4: Q ← Q + 1; }
- 5: while (x, y don't satisfy the constraints of two dimensions)
- 6: for (all IP cores ni) /\*divide IP cores into several clusters\*/
- 7: if the communication volume from ni to nj is the highest then
- 8: create IP pair(ni, nj);
- 9: end if
- 10: end for
- 11: for (all IP pairs)
- 12: if IP pair(ni, nj) and IP pair(nj, ni) exist at the same time then
- 13: delete IP pair(ni, nj) and IP pair(nj, ni);
- 14: create cluster{ni, nj}
- 15: end if
- 16: end for
- 17: put left IP pairs in order as the intra-pair communication volume decreases;
- 18: for (all left IP pairs(ni, nj))
- 19: if ni or nj is one element in one existed cluster A then
- 20: if the number of IP cores in cluster A is less than 4 then
- 21: add nj or ni into cluster A
- 22: end if
- 23: else then
- 24: delete IP pair(ni, nj), construct cluster{nj} or {ni};
- 25: end else
- 26: end if
- 27: end for
- 28: for (all clusters) /\*map clusters to RONoC\*/
- 29: calculate intra-cluster communication volume of each cluster
- 30: end for
- 31: search for cluster B of which intra-cluster communication volume is the highest;
- 32: map IP cores in cluster B to RONoC with performance optimized;
- 33: label cluster B mapped;
- 34: while (there are clusters unmapped) then
- 35: search for one unmapped cluster C from which the communication volume is the highest to the mapped clusters;
- 36: map IP cores in cluster C to RONoC with performance optimized;
- 37: end while

The energy consumption comprises the energy for path reservation in the electrical control network ($E_{Electrical}$) and the energy for controlling the related optical devices in the optical network ($E_{Optical}$). The total energy consumption is given by (1),

$$\begin{aligned}
E_{total} &= E_{Electrical} + E_{Optical} \\
&= (E_{setup} + E_{ack}) + \left(\sum_{on} t_{on} \cdot P_{ring} + E_{O/E} + E_{E/O}\right)
\end{aligned} \tag{1}$$

where $E_{setup}$ and $E_{ack}$ are the energy consumed by the setup packets and ACK packets, $P_{ring}$ is the power for tuning an MR, and $E_{E/O}$ and $E_{O/E}$ are the energy needed in the transmitter and receiver circuits. Figure 5 illustrates the energy comparison of RONoC and mesh. The electrical energy and optical energy consumed by each packet differ little across the three scenarios, but the average total energy in RONoC is only 25% and 47% of that in Mesh-GA and Mesh-SA, respectively. The main reason is that many packets in RONoC skip the path reservation phase and therefore consume no energy in the electrical control network.

**Fig. 5** Comparisons of energy consumption and insertion loss.
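Equation (1) can be evaluated with a small helper to show why skipping the reservation phase matters. All numeric values below are placeholders chosen for illustration, not values from this letter, and the function name and unit choices (pJ, ns, mW) are assumptions.

```python
def total_energy_pj(e_setup, e_ack, ring_on_times_ns, p_ring_mw, e_oe, e_eo):
    """Eq. (1) in picojoules: electrical reservation energy plus optical energy.

    Times are in ns and ring power in mW, so each product t_on * P_ring is
    already in pJ (1 ns * 1 mW = 1 pJ).
    """
    e_electrical = e_setup + e_ack
    e_optical = sum(t * p_ring_mw for t in ring_on_times_ns) + e_oe + e_eo
    return e_electrical + e_optical


# Placeholder numbers: a packet routed over a reserved path through two MRs.
routed = total_energy_pj(2.0, 1.0, [80.0, 80.0], 0.02, 0.5, 0.5)
# A packet between cores on the same router: no SETUP/ACK, a single MR.
direct = total_energy_pj(0.0, 0.0, [80.0], 0.02, 0.5, 0.5)
print(routed, direct)
```

Setting $E_{setup}$ and $E_{ack}$ to zero for directly connected cores removes the whole electrical term, which is the effect described in the paragraph above.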

The power attenuation depends on the optical losses, such as waveguide propagation ($IL_{travel}$), waveguide bends ($IL_{bend}$), crossings ($IL_{cross}$), off- and on-resonance of passive or active MRs ($IL_{through}$ and $IL_{drop}$), modulators ($IL_{modulator}$), detectors ($IL_{detector}$), etc. The analysis of insertion loss is based on the following formula,

$$Loss = L_{wg} \cdot IL_{travel} + \sum IL_{bend} + \sum IL_{cross} + \sum IL_{through} + \sum IL_{drop} + IL_{modulator} + IL_{detector} \tag{2}$$

where $L_{wg}$ is the length of the waveguide. Although many more waveguide crossings and ring drops occur in RONoC, it still maintains an average optical loss similar to that of Mesh-GA. As Fig. 5 shows, the maximum optical loss of RONoC is even slightly lower than that of Mesh-GA. The shorter hop count of RONoC offsets the loss penalty of its denser architecture.
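A minimal sketch of Eq. (2) is given below. It assumes every element of a given kind contributes the same loss, so each sum reduces to a count times a per-element value. The per-element losses in `LossParams` are placeholders, not the device parameters used in the simulation, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossParams:
    """Per-element losses in dB; all defaults are illustrative placeholders."""
    travel_per_cm: float = 1.0
    bend: float = 0.01
    cross: float = 0.1
    through: float = 0.01
    drop: float = 0.5
    modulator: float = 1.0
    detector: float = 0.1


def insertion_loss_db(length_cm, n_bends, n_crossings, n_through, n_drops,
                      params):
    """Eq. (2) with identical loss per element of each kind."""
    return (length_cm * params.travel_per_cm
            + n_bends * params.bend
            + n_crossings * params.cross
            + n_through * params.through
            + n_drops * params.drop
            + params.modulator
            + params.detector)


# Placeholder path: 0.5 cm of waveguide, 4 bends, 10 crossings,
# 20 off-resonance MRs passed, and 2 on-resonance drops.
loss = insertion_loss_db(0.5, 4, 10, 20, 2, LossParams())
print(f"{loss:.2f} dB")
```

With real device parameters, the same function could be evaluated for every source-destination path to obtain average and maximum losses of the kind compared in Fig. 5.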

### 5. Conclusion

The RONoC architecture is proposed to enable the topology to dynamically match the communication patterns of various applications. The reconfiguration operation is simple: it only requires powering the related resonators on or off. A configuration algorithm for RONoC is developed to achieve lower power consumption. The simulation results show that RONoC has a much higher throughput and lower power consumption. In future work, dynamic traffic patterns, such as all-to-all communication, will be considered, and new configuration algorithms will be developed.

#### Acknowledgments

This work is supported partly by the National Science Foundation of China under Grant No.61070046 and No.60803038, the special fund from State Key Lab (No.ISN1104001), the Fundamental Research Funds for the Central Universities under Grant No.K5051301003, and the 111 Project under Grant No.B08038.

#### References

- [1] K. Goossens and A. Hansson, "The aethereal network on chip after ten years: Goals, evolution, lessons, and future," Proc. 47th DAC, pp.306–311, Anaheim, CA, USA, June 2010.
- [2] H. Matsutani, M. Koibuchi, Y. Yamada, D.F. Hsu, and H. Amano, "Fat H-Tree: A cost-efficient tree-based on-chip network," TPDS, vol.20, no.8, pp.1126–1141, Aug. 2009.
- [3] G. Chen, H. Chen, M. Haurylau, N. Nelson, D.H. Albonesi, P.M. Fauchet, and E.G. Friedman, "Predictions of CMOS compatible on-chip optical interconnect," Proc. ACM/IEEE Int. Workshop Syst. Level Interconnect Prediction, pp.13–20, 2005.
- [4] D. Vantrease, R. Schreiber, M. Monchiero, et al., "Corona: System implications of emerging nanophotonic technology," Proc. 35th ISCA, June 2008.
- [5] G. Kurian, J.E. Miller, J. Psota, J. Eastep, J. Liu, J. Michel, L.C. Kimerling, and A. Agarwal, "Atac: A 1000-core cache-coherent processor with on-chip optical network," Proc. 19th PACT, pp.477–488, New York, NY, USA, 2010.
- [6] Z. Li, M. Mohamed, and X. Chen, "Iris: A hybrid nanophotonic network design for high-performance and low-power on-chip communication," JETC, vol.7, no.2, June 2011.
- [7] J. Chan and K. Bergman, "Photonic interconnection network architectures using wavelength-selective spatial routing for chip-scale communications," J. Opt. Commun. Netw., vol.4, pp.189–201, 2012.
- [8] M. Modarressi and H. Sarbazi-Azad, "Power-aware mapping for reconfigurable NoC architectures," Proc. 25th ICCD, pp.417–422, Lake Tahoe, CA, USA, Oct. 2007.
- [9] R. Dafali, J.-P. Diguet, and M. Sevaux, "Key research issues for reconfigurable network-on-chip," Proc. 2008 Int. Conf. on Reconfigurable Computing and FPGAs, pp.181–186, 2008.
- [10] M.B. Stensgaard, J. Sparso, and M. Bystrup, "ReNoC: A network-on-chip architecture with reconfigurable topology," Proc. 2nd NoCS, pp.55–64, Newcastle upon Tyne, April 2008.
- [11] B. Fu, Y. Han, J. Ma, H. Li, and X. Li, "An abacus turn model for time/space-efficient reconfigurable routing," Proc. 38th ISCA, pp.259–270, San Jose, CA, USA, June 2011.
- [12] I. Artundo, W. Heirman, M. Loperena, C. Debaes, J.V. Campenhout, and H. Thienpont, "Low-power reconfigurable network architecture for on-chip photonic interconnects," Proc. 17th HOTI, pp.163–169, New York, NY, USA, Aug. 2009.
- [13] Y. Gao, Y. Jin, Z. Chang, and W. Hu, "Ultra-low latency reconfigurable photonic network on chip architecture based on application pattern," Proc. OFC, 2009.
- [14] S. Lin, L. Su, H. Su, D. Jin, and L. Zeng, "Hierarchical cluster-based irregular topology customization for Networks-on-Chip," Int. Conf. on Embedded and Ubiquitous Computing, pp.373–377, Dec. 2008.
- [15] D. Bertozzi, A. Jalabert, S. Murali, R. Tamhankar, S. Stergiou, L. Benini, and G. De Micheli, "NoC synthesis flow for customized domain specific multiprocessor systems-on-chip," IEEE Trans. Parallel Distrib. Syst., vol.16, no.12, pp.113–129, 2005.
- [16] Q. Xu, S. Manipatruni, B. Schmidt, J. Shakya, and M. Lipson, "12.5 Gbit/s carrier-injection-based silicon micro-ring silicon modulators," Opt. Express, vol.15, no.2, pp.430–436, 2007.