A hardware/software platform for QoS bridging over multi-chip NoC-based systems
Introduction
The demand for executing more and more applications on consumer electronic devices has led to an increase in the size and complexity of recent embedded systems. On one hand, modular design approaches have enabled the integration of numerous processor cores, memories and peripheral IPs into an embedded system; on the other hand, implementation, prototyping and verification of such large designs on chips/FPGAs have become more complex and challenging, as follows.
Implementation: although Moore’s law indicates that the number of transistors that can be placed on a chip doubles approximately every two years, recent designs can be so large that they exceed the resources of a single chip. There have recently been some efforts to bring 3D IC stacking on a single chip into practice [1]; traditionally, however, system designers partition such large systems into smaller sub-systems and implement them as several Systems-on-Chip (SoCs), or accompany SoCs with a number of companion chips [2]. In such systems, denoted as multi-chip systems, each SoC has to communicate with other SoCs, and therefore a multi-chip interconnection scheme is essential.
Prototyping: FPGA prototyping has become a common approach for early software development and hardware design verification [3]. The limited resources of an FPGA, however, are not sufficient for prototyping current large SoCs. A system therefore has to be partitioned into a number of sub-systems, each of which is implemented on a single FPGA chip [4], so that a multi-FPGA system is formed [5]. Potentially, the FPGA chips are located on different circuit boards which are physically decoupled. Such systems require multi-chip interconnection mechanisms that also support board-to-board multi-FPGA communication.
Verification: both the implementations and the prototypes of current complex SoCs may still contain errors. Debug and verification processes are essential to find the erroneous parts of the systems and to verify the systems’ correct functionality. Therefore, external, fast, non-intrusive access to on-chip systems is required for two purposes: (i) to retrieve traces of on-chip data off-chip, and (ii) to enable an external host to perform debug and verification actions [6].
The common problem in the aforementioned trends is the need for an off-chip communication scheme for interconnecting individual SoCs. Several techniques have been proposed in the literature for this purpose, such as the work in [7], [8], [9]; however, to the best of our knowledge, none of them provides a generic solution that addresses inter-chip, inter-FPGA, and chip/FPGA-host communication together. In this work, we propose a generic, efficient hardware and software architecture for off-chip interconnection of individual SoCs. The SoCs internally have their own communication mechanisms to interconnect the cores and IPs. The off-chip interconnection mechanism is efficient when it is compatible with, and seamlessly integrates into, the on-chip interconnect. Since Network-on-Chip (NoC) [10] has recently become a common on-chip interconnection technology, our proposed off-chip interconnection scheme is designed to be compatible with the properties of NoCs.
Various NoC architectures exist in the literature, e.g., Æthereal [11], Nostrum [12], Mango [13], QNoC [14], that may offer one or more quality of service (QoS) classes. Many applications, e.g., signal processing and video streaming, have timing and throughput demands that each require a specific QoS class, such as Guaranteed Throughput (GT) or Best Effort (BE). Consequently, the off-chip communication scheme should also be able to offer different QoS classes for the traffic of interconnected sub-systems in order to be compatible with the target on-chip interconnects.
In this paper, we propose a generic, efficient technique for interconnecting a NoC-based system with other (NoC-based) SoCs or any other external IP. In the rest of this paper, we refer to this scheme as bridging, since it makes a bridge for traffic from/to a chip to/from another chip/IP.
To make the bridge fully compatible with the on-chip interconnects, we establish four design requirements for the bridge, as follows: (i) the bridge should seamlessly extend the NoC such that memory-mapped accesses remain unchanged from an application’s point of view. In other words, the bridge is transparent in the global memory space of the system, such that the memory-consistency model of the system is preserved; (ii) multi-chip bridging at the circuit board level should be supported, and the bridge should temporally and physically decouple the systems implemented on different circuit boards; (iii) the quality of service offered to the applications by the overall interconnect (i.e., sub-NoCs + bridge(s)) should be preserved; and (iv) the bridge should be implemented efficiently, with high performance in terms of bandwidth and latency and with low area cost. In the context of a bridging scheme that fulfills all these requirements, the contribution of our paper is threefold, as follows.
First, we investigate a NoC-based system to identify the possible bridge insertion points (i.e., links). At each link, we study different layers of the on-chip interconnect protocol stack [15] for possible bridging schemes. Our protocol stack model is based on the proposal in [16], where the model consists of five layers referred to as the session, transport, network, data link, and physical layers. Our design space investigation is driven by the NoC properties that have an impact on the bridge’s requirements. We refer to these properties as design options, and we identify them as follows: (i) parallel/serial link, (ii) flow control, (iii) buffering, (iv) routing, (v) synchronicity, and (vi) QoS. The investigation results in a novel proposal for a bridge design at the transport layer of the NoC.
Second, we propose a hardware architecture for the bridging scheme. The architecture is generic in the sense that it supports inter-chip, inter-FPGA, and chip/FPGA-IP systems alike. We implement an HDL version of the bridge kernel that supports a generic number of NoC connections, each of which may be either best-effort or guaranteed-throughput.
The third contribution of this work is a software architecture to configure the bridged sub-systems and to enable transparent global memory space communication over them. This architecture extends the run-time NoC configuration technique proposed in [17] and enables the bridge to integrate with, and act as an on-chip host for, NoC-based SoCs.
For our experiments, we set up a multi-FPGA system in which two instances of the bridge are utilized: one to connect two sub-NoCs, each implemented on a separate FPGA, and another to connect to an external host, which is a PC in our case. The experimental results show that the bridge achieves high performance in terms of bandwidth and latency, and demonstrate that it is able to provide the required QoS to data traffic.
The rest of this paper is organized as follows. In Section 2 we review the related work. Section 3 gives an overview of our target on-chip interconnection network. In Section 4, the bridge design requirements and the design options are explained in detail. Based on them, in Section 5, we investigate the NoC’s links and the protocol stack to propose the best bridging scheme. The bridge hardware architecture is proposed in Section 6, followed by the proposal for the software architecture in Section 7. The experimental results of the stand-alone bridge and of the case where the bridge is used in a multi-FPGA set-up are presented in Section 8. Finally, Section 9 concludes the contributions of this paper.
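Requirement (i) can be illustrated with a minimal sketch of a transparent global memory map: the application issues the same read and write calls whether the target address is served on-chip or through a bridge. The address ranges and the names `Memory`, `Bridge`, and `AddressDecoder` are hypothetical and chosen only for illustration; they are not taken from the bridge design described in this paper.

```python
class Memory:
    """A word-addressable memory on one chip (hypothetical model)."""
    def __init__(self, size):
        self.words = [0] * size

    def write(self, offset, value):
        self.words[offset] = value

    def read(self, offset):
        return self.words[offset]


class Bridge:
    """Forwards accesses to a memory on another chip and counts crossings."""
    def __init__(self, remote):
        self.remote = remote
        self.crossings = 0

    def write(self, offset, value):
        self.crossings += 1
        self.remote.write(offset, value)

    def read(self, offset):
        self.crossings += 1
        return self.remote.read(offset)


class AddressDecoder:
    """Maps global addresses to targets; applications only see this interface."""
    def __init__(self):
        self.regions = []  # list of (base, size, target)

    def map_region(self, base, size, target):
        self.regions.append((base, size, target))

    def _lookup(self, addr):
        for base, size, target in self.regions:
            if base <= addr < base + size:
                return target, addr - base
        raise ValueError(f"unmapped address {addr:#x}")

    def write(self, addr, value):
        target, offset = self._lookup(addr)
        target.write(offset, value)

    def read(self, addr):
        target, offset = self._lookup(addr)
        return target.read(offset)


local_mem = Memory(256)
remote_mem = Memory(256)
bridge = Bridge(remote_mem)

bus = AddressDecoder()
bus.map_region(0x0000, 256, local_mem)  # on-chip memory (assumed range)
bus.map_region(0x1000, 256, bridge)     # memory on the other chip, via the bridge

bus.write(0x0010, 7)  # local access
bus.write(0x1010, 9)  # remote access, identical call from the application's view
```

In this toy model only the decoder knows which region sits behind the bridge, which is the sense in which the bridge is invisible to the application.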
Related work
There are three main research topics that are directly related to our work. This section discusses them in turn, as follows. First, the existing techniques in on-chip and off-chip signaling are introduced, in order to compare the related work that targets NoC-based systems with our proposed technique. Second, the related work on the on-chip interconnect protocol stack is discussed; and third, we review interconnect configuration techniques.
Traditionally, a large system is built and prototyped by using
On-chip interconnect overview
The on-chip interconnection network has a key role in the bridging scheme. In this section we give an overview of the interconnection network. A connection between two Intellectual Property (IP) components is set up via the interconnect. The IPs are illustrated as a master and a slave in Fig. 1. The on-chip interconnect consists of traditional bus technology and a NoC architecture. The master starts a request by sending a write or read command to the bus (point 1). The bus is responsible for
Bridging requirements and design options
The requirements for a multi-chip bridge have direct impact on the bridging scheme. Therefore, in this section we first establish the design requirements as follows.
(1) Transparency: ideally, from an application’s point of view the bridge should be invisible, i.e., the global memory consistency model should be maintained. This is especially essential in the case of FPGA emulation of a partitioned system, where the emulated system should be functionally as close as possible to the real prototype. In
Protocol stack exploration
A connection between a master and a slave is formed by a set of physical communication links through the network, e.g., router–router and router–NI links. The links are illustrated as numbered points in Fig. 1. The bridge could possibly be instantiated at each of these links. In this way, the bridge either partitions a NoC into two sub-networks, or connects two different NoC-based systems at these links.
Moreover, there is more than one layer of the interconnect protocol stack which is involved
Bridge hardware architecture
In this section we present the detailed architecture of the bridge to implement Scheme V illustrated in Fig. 2. The bridge architecture diagram is depicted in Fig. 3(a). The bridge kernel consists of five main units that are responsible for (i) providing the off-chip board-to-board interface, (ii) forming off-chip communication data, (iii) interfacing with multiple connections of the target NoC, (iv) arbitrating between the connections, and (v) controlling the flow of on-chip and off-chip data.
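As an illustration of unit (iv), the sketch below shows one possible way to arbitrate between guaranteed-throughput and best-effort connections that share a single off-chip link: GT connections own fixed slots in a time-division table, and BE connections fill unowned or idle slots in round-robin order. This policy is an assumption made for illustration, and the `Arbiter` class, the slot table, and the connection names are hypothetical; the arbitration actually used in the bridge kernel may differ.

```python
from collections import deque


class Arbiter:
    """Slot-table arbiter: GT connections own slots, BE connections fill the rest."""

    def __init__(self, slot_table, be_connections):
        self.slot_table = slot_table            # e.g. ["gt0", None, "gt0", None]
        self.queues = {c: deque() for c in set(slot_table) if c is not None}
        self.queues.update({c: deque() for c in be_connections})
        self.be_order = deque(be_connections)   # round-robin order for BE
        self.slot = 0

    def enqueue(self, connection, word):
        self.queues[connection].append(word)

    def next_word(self):
        """Return (connection, word) for the current slot, or None if idle."""
        owner = self.slot_table[self.slot]
        self.slot = (self.slot + 1) % len(self.slot_table)
        # A GT connection with pending data always gets its own slot.
        if owner is not None and self.queues[owner]:
            return owner, self.queues[owner].popleft()
        # Otherwise the slot is offered to BE connections in round-robin order.
        for _ in range(len(self.be_order)):
            candidate = self.be_order[0]
            self.be_order.rotate(-1)
            if self.queues[candidate]:
                return candidate, self.queues[candidate].popleft()
        return None


# Hypothetical setup: one GT connection owning every other slot, two BE connections.
arb = Arbiter(slot_table=["gt0", None, "gt0", None], be_connections=["be0", "be1"])
for i in range(2):
    arb.enqueue("gt0", f"g{i}")
    arb.enqueue("be0", f"a{i}")
    arb.enqueue("be1", f"b{i}")

schedule = [arb.next_word() for _ in range(6)]
```

In this sketch the GT connection is served in every slot it owns regardless of BE load, while BE traffic reclaims a GT slot only when the owner has nothing to send.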
Software architecture
Typically, NoC-based SoCs require a run-time configuration scheme [17]. A processor core, which is usually on the same chip as the NoC, is locally responsible for performing the configuration. This processor is called the host. In the presence of an off-chip connection, the host may be an external IP such as a Personal Computer (PC) or the local host of another SoC on a separate chip.
In this section we first briefly present a basic interconnect configuration scheme, which is proposed in [17], and we show
Discussion and experimental results
In this section we first exercise the standalone bridge to evaluate its area cost when implemented on an FPGA, and to assess its performance under variable traffic loads. Second, we use the bridge in a NoC-based system to connect two sub-systems implemented on two separate FPGA chips. Using this setup, we show system-level performance results of the bridge. The results are obtained by experimenting with the system, both for interconnect configuration and for various applications’ data transfers. The results
Conclusions
In this paper we proposed a generic, efficient off-chip bridging scheme for SoCs that are implemented on separate silicon or FPGA chips. The scheme is compatible with NoC technology, which is commonly utilized in recent embedded systems. We have investigated the protocol stack of an on-chip interconnect to determine the best layer for implementing the bridge on the possible links of the interconnect. The proposal is a scheme at the transport layer of the stack. At this layer, the bridge fulfills
References (54)
- et al., QNoC: QoS architecture and design process for network on chip, Journal of Systems Architecture (2004)
- et al., Bridging the processor-memory performance gap with 3D IC technology, IEEE Design and Test of Computers (2005)
- F. Steenhof, H. Duque, B. Nilsson, K. Goossens, R.P. Llopis, Networks on chips for high-end consumer-electronics TV...
- et al., Evaluating large system-on-chip on multi-FPGA platform, Embedded Computer Systems: Architectures, Modeling, and Simulation (2007)
- A.-M. Kouadri-Mostefaoui, B. Senouci, F. Petrot, Scalable multi-FPGA platform for networks-on-chip emulation, in: Proc....
- S. Hauk, Multi-FPGA Systems, Ph.D. Thesis, University of Washington,...
- K. Goossens, B. Vermeulen, A.B. Nejad, A high-level debug environment for communication-centric debug, in: Proc. DATE,...
- A. Shacham, K. Bergman, L. Carloni, On the design of a photonic network-on-chip, in: Proc. NOCS, 2007, pp....
- M. Stepniewska, A. Luczak, J. Siast, Network-on-Multi-Chip (NoMC) for Multi-FPGA Multimedia Systems, in: Proc. Conf. on...
- S. Furber, S. Temple, A. Brown, On-chip and inter-chip networks for modeling large-scale neural systems, in: Proc....
- A silicon-on-silicon field programmable multichip module (FPMCM) integrating FPGA and MCM technologies, IEEE Transactions on CPMT
- NoC design flow for TDMA and QoS management in a GALS context, EURASIP Journal on Embedded Systems
- BEE2: a high-end reconfigurable computing system, IEEE Design and Test of Computers