NoC-DPR: A new simulation tool exploiting the Dynamic Partial Reconfiguration (DPR) on Network-on-Chip (NoC) based FPGA
Introduction
Many applications mapped onto SRAM-based FPGAs (Field Programmable Gate Arrays), such as signal processing (including image and video processing), software-defined radio (SDR) [1], and electronic measurement, increasingly use the Dynamic Partial Reconfiguration (DPR) feature. Partially reconfigurable (PR) devices save chip area by programming only the physical resources needed in each operation phase. Programming only the required block also saves power, because it reduces static leakage.
The prime factor in assessing the feasibility of DPR techniques, such as ICAP and JTAG, is the available lead time, that is, the latency between the configuration and the initiation of a PR, denoted the reconfiguration time (RT). Consequently, a growing body of research aims to optimize the RT of DPR, which depends on, and is bounded by, the frame size of SRAM-based FPGA layouts [1].
Due to the continuous scaling of CMOS devices, manufacturers are increasing the number of functions implemented on a single chip. Therefore, the concept of System-on-Chip has been introduced, which consists of processing elements (PEs) and storage elements (SEs) connected by a complex communication architecture.
In the last few years, communication among these PEs has become a vital factor in the design of large-scale systems. As designers increase the number of parallel PEs to maximize the capability of modern designs, processing power has grown and data-intensive applications have emerged. Consequently, the challenges of communication among PEs configured on FPGAs have become significant and require innovative solutions. Network-on-Chip (NoC), a prominent communication concept, has therefore been adapted for FPGAs to handle these challenges.
To investigate this NoC concept, a state-of-the-art tool called NoC-DPR has been developed [2]. It is a cycle-accurate simulator for NoCs that support DPR and is used to simulate the performance of NoC-based FPGAs. NoC-DPR integrates a NoC simulator, NoCTweak [3], with ReChannel [4], a SystemC library for DPR simulation. All PEs of the NoC are reconfigured dynamically to adopt a new application at run time.
This paper is organized as follows. Section 2 provides an overview of previous related research efforts in DPR simulation and NoC simulation. In Section 3, the NoC-DPR simulator architecture is presented. Section 4 investigates the performance of NoC-DPR compared with the NoCTweak simulator. In Section 5, the DPR experiment is analyzed along with the results. Section 6 illustrates a case study of embedded applications using the NoC-DPR simulator. Design insights and recommendations for implementing DPR on NoC-based FPGAs are stated in Section 7. Finally, Section 8 concludes the paper and presents future work.
Section snippets
Related work
Since the first generation of Xilinx FPGAs that support DPR, Virtex-II in early 2007 [1], designing for DPR has been a somewhat complex task because of the lack of supporting tools and the need for a full understanding of the FPGA architecture. Therefore, FPGA designers use DPR simulators at early design stages as a proof of concept and to reduce time to market. Several approaches [5], [6], [7] have been proposed to model dynamically reconfigurable systems at system level using SystemC,
NoC-DPR simulator architecture
The NoC-DPR simulator is a command-line tool that consists of a 2-D mesh network of routers, simulated by NoCTweak [3], as illustrated in Fig. 2. Each node consists of a processing element (PE), a network interface (NI), and an associated router. Each router connects to its four nearest neighboring routers, forming a 2-D mesh network. Using the ReChannel [4] library, each PE is dynamically reconfigured by a special type of data packet generated by a designated node, the master node (0, 0). Data packets are
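The 2-D mesh connectivity described in this section can be illustrated with a short Python sketch. It is not taken from NoC-DPR or NoCTweak: the function names `mesh_neighbors` and `hop_count` are hypothetical, and the hop count assumes minimal routing (for example XY routing), which this snippet does not confirm.

```python
def mesh_neighbors(x, y, cols, rows):
    """Return the routers adjacent to router (x, y) in a cols x rows mesh.

    Interior routers have four neighbors; edge and corner routers have fewer.
    """
    candidates = [(x, y - 1), (x + 1, y), (x, y + 1), (x - 1, y)]  # N, E, S, W
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < cols and 0 <= cy < rows]


def hop_count(src, dst):
    """Router-to-router hops between two nodes, assuming minimal routing."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


if __name__ == "__main__":
    master = (0, 0)  # the master node named in the text
    print(mesh_neighbors(1, 1, 4, 4))   # interior router: four neighbors
    print(mesh_neighbors(0, 0, 4, 4))   # corner router: two neighbors
    print(hop_count(master, (3, 2)))    # distance a packet from the master travels
```

Under that routing assumption, the distance from the master node (0, 0) grows with the mesh dimensions, which is one reason network size matters when reconfiguration packets originate from a single node.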
Network interface impact
The network interface is composed of two decoupling buffers that store and synchronize flits (a flit, or FLow control unIT, is the minimum unit of a message). Inserting an explicit NI between the PE and the router affects network performance, specifically latency and throughput.
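As a rough illustration of why an explicit NI influences latency and throughput, the following Python sketch models the two decoupling buffers as bounded FIFOs. The class names, the `depth` parameter, and its default of 2 flits are assumptions made for this example and do not describe the NoC-DPR implementation.

```python
from collections import deque


class DecouplingBuffer:
    """Bounded FIFO holding flits between two clocked components."""

    def __init__(self, depth):
        self.depth = depth
        self.slots = deque()

    def push(self, flit):
        """Store a flit; return False (backpressure) when the buffer is full."""
        if len(self.slots) >= self.depth:
            return False
        self.slots.append(flit)
        return True

    def pop(self):
        """Remove and return the oldest flit, or None when the buffer is empty."""
        return self.slots.popleft() if self.slots else None


class NetworkInterface:
    """Hypothetical NI: one buffer per direction between the PE and the router."""

    def __init__(self, depth=2):  # depth is an assumed value, not from the paper
        self.to_router = DecouplingBuffer(depth)
        self.to_pe = DecouplingBuffer(depth)


if __name__ == "__main__":
    ni = NetworkInterface(depth=2)
    for flit in ("head", "body", "tail"):
        accepted = ni.to_router.push(flit)
        print(flit, "accepted" if accepted else "stalled")
    print(ni.to_router.pop())  # the router drains the oldest flit first
```

Every flit spends time in a buffer before reaching the router, and a full buffer stalls the PE, so both latency and throughput depend on the buffer depth and on how quickly the router drains it.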
The latency after inserting the NI is measured and compared with the latency of NoCTweak. A network of wormhole routers with a buffer size of 2 flits per input port running at
Results and discussion
The experiment in this work simulates DPR on a NoC-based FPGA platform using different network sizes and different numbers of parallel DPR operations; the comparison is made with respect to the reconfiguration time (RT). Initially, a Virtex-5 xc5vfx100t FPGA is used to select different partial reconfiguration (PR) regions, then the bitstream size of each configuration region is calculated using the Xilinx ISE v14.7 tool. Finally, RT is determined by using partial reconfiguration cost
Case study: NoC-DPR with embedded application
Several embedded applications, such as an 802.11a WiFi receiver [17], a Video Object Plane Decoder, and a multimedia system [18], are examined using the NoC-DPR simulator to gain early insight into application performance at the design stage.
Each application is composed of a different number of tasks with different FIR. All tasks are mapped onto the network using either random mapping or the n-map mapping algorithm [19]. Furthermore, each task communicates with one or multiple destinations. The specifications of
Design recommendations
Some design insights and recommendations should be taken into account during the design of DPR on NoC-based FPGAs using the proposed NoC-DPR simulator:
- A general NoC platform cannot be used to implement a DPR application directly. For instance, when one processing element (PE) is performing DPR, the network should prevent other PEs from sending data to or receiving data from this PE until DPR has finished.
- In the proposed NoC-DPR simulator, it is assumed that PE (0, 0) is the master of the DPR process that is responsible
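The first recommendation above, isolating a PE while it undergoes DPR, can be sketched in Python as follows. The `DprGate` class and its method names are hypothetical and only illustrate the idea. The sketch covers ordinary data traffic only; reconfiguration packets sent by the master to the PE being reconfigured would need a separate path that is not modeled here.

```python
class DprGate:
    """Tracks which PEs are being reconfigured and gates ordinary data traffic."""

    def __init__(self):
        self._reconfiguring = set()

    def begin_dpr(self, pe):
        """Mark a PE, given as an (x, y) tuple, as undergoing DPR."""
        self._reconfiguring.add(pe)

    def end_dpr(self, pe):
        """Mark a PE as available again once its DPR has finished."""
        self._reconfiguring.discard(pe)

    def may_transfer(self, src, dst):
        """Allow a data transfer only if neither endpoint is being reconfigured."""
        return src not in self._reconfiguring and dst not in self._reconfiguring


if __name__ == "__main__":
    gate = DprGate()
    gate.begin_dpr((1, 1))
    print(gate.may_transfer((0, 1), (1, 1)))  # destination under DPR
    print(gate.may_transfer((0, 1), (2, 2)))  # unrelated PEs
    gate.end_dpr((1, 1))
    print(gate.may_transfer((0, 1), (1, 1)))  # DPR finished
```

Because the gate tracks a set of PEs rather than a single flag, the same check also covers several PEs being reconfigured in parallel.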
Conclusion and future work
In this work, a state-of-the-art NoC-DPR simulator is proposed, and recommendations are derived for implementing DPR on NoC-based FPGAs so as to obtain the optimal network size.
NoC-based FPGAs enhance reconfiguration capabilities because multiple DPRs can be performed simultaneously. However, supporting multiple DPRs requires additional resources, such as a control unit and decoupling buffers. Accordingly, the reconfiguration time of DPR with NoC is better than
Acknowledgment
This research was funded by NTRA, ITIDA, Cairo University, and Zewail City of Science and Technology.
References (19)
- et al. Performance evaluation of dynamic partial reconfiguration techniques for software defined radio implementation on FPGA
- et al. Exploiting the dynamic partial reconfiguration on NoC-based FPGA
- et al. NoCTweak: a Highly Parameterizable Simulator for Early Exploration of Performance and Energy of Networks On-chip (July 2012)
- et al. ReChannel: describing and simulating reconfigurable hardware in SystemC
- Advanced Methodology for Designing Reconfigurable SoC and Application-targeted IP-entities in Wireless Communications Webpage (2002)
- et al. System-Level modelling for reconfigurable SoCs
- et al. An open-source tool for simulation of partially reconfigurable systems using SystemC
- et al. Designing for dynamic partially reconfigurable FPGAs with SystemC and OSSS
- BookSim Interconnection Network Simulator