# LETTER A Configurable Hardware Word Re-Ordering Block for Multi-Lane Communication Protocols: Design and Use Case

Pietro NANNIPIERI<sup>†a)</sup>, Gianmarco DINELLI<sup>†b)</sup>, Nonmembers, and Luca FANUCCI<sup>†c)</sup>, Member

**SUMMARY** Data rate requirements, from consumer applications to automotive and aerospace, have grown rapidly in recent years. This has led to the development of a series of communication protocols (e.g. Ethernet, PCI-Express, RapidIO and SpaceFibre) which use more than one communication lane, both to increase the data rate and to improve link reliability. Some of these protocols, such as SpaceFibre, are able to detect real-time changes in the number of active lanes and to adapt the data flow accordingly, providing a flexible solution that is robust to lane failures. This results in a data path that varies in real time in the lower layers of the data handling system. The aim of this paper is to propose the architecture of a hardware block capable of reading a fixed number of words from a host FIFO and shaping them onto a number of words that varies in real time and is equal to the number of active lanes.

key words: VLSI, multi-lane, high-speed, word re-ordering, SpaceFibre

## 1. Introduction

Over the last few decades, several protocols for consumer applications have introduced the possibility of establishing a multi-lane link in order to increase the achievable data rate. Examples of these protocols are PCI-Express [1], Ethernet [2] and RapidIO [3]. In the last few years, high data rate requirements have become relevant in other fields of application as well. For instance, in the space field, Synthetic Aperture Radar imagers mounted as satellite payloads require data rates of several Gbps. The space community has handled this problem either by adapting existing consumer communication protocols to the stringent robustness and reliability requirements of space applications (e.g. RapidIO [4], [5]) or by developing dedicated and application-optimized protocols, such as SpaceFibre [6], [7]. In particular, a SpaceFibre protocol implementation shall detect changes in the number of active lanes of the communication link in real time, in order to preserve data integrity and speed up the link reconnection process. In this paper, a general purpose digital Intellectual Property (IP) block (from now on named SWIP, SWitching IP) is described. The SWIP takes a fixed-length data array as input and maps it onto a variable number of words, equal to the number of active lanes (at most the input length), which may change state in real time, without any loss of data and preserving the initial order of the words. The block is designed to be inserted in the data link layer of a multi-lane communication protocol,

DOI: 10.1587/transfun.E102.A.747

between the physical layers and the host interface, which is assumed to be implemented as a fixed length asynchronous FIFO [8]. High-speed protocols all differ in terms of required flexibility and implementation. However, after a review of PCI-Express, Ethernet and RapidIO, and of an emerging space-qualified protocol, SpaceFibre, a set of general requirements has been derived. Let N be the maximum number of parallel communication lanes of the system, M the actual number of active lanes and W\_w the width in bits of a single word. The SWIP shall:

- process a fixed length input data stream in order to map it on a different length output data path without any data loss.
- have an interface with the host able to transfer N words in parallel.
- have an interface with the lower layers able to keep the words sent to each lane separate.
- know the status of each communication lane.
- have a reset interface (may be synchronous or asynchronous, implementation dependent) and a global enable.
- send the words over the link in the same order in which they are read out of the host FIFO.

The necessity of a block able to shape the number of words to be sent from a fixed number to a number that varies in real time arises from these requirements. The proposed solution aims to handle the problem generically, so that it can be adopted by the greatest number of multi-lane communication protocols. In Fig. 1 the black box model of the proposed circuit is shown with its inputs, outputs and main configuration signals. The interface is composed of the signals described in Table 1. The necessity of a word re-ordering block arises from the fact that multi-lane protocols such as SpaceFibre may allow the number of active lanes to change dynamically without any data loss. The FIFO width is equal to N\*W\_w in order to exploit the maximum achievable bandwidth. This implies that if M < N, the words read shall be mapped onto the available active lanes, preserving data order and avoiding data loss. The literature lacks works compatible with such requirements. The circuit described in [9] is composed of two buffers and one multiplexer and is not flexible in terms of the number of lanes to be handled. Another similar block is described in [10], but in this case too the flexibility provided is insufficient, as the number of output words cannot change dynamically and appears limited compared with the SWIP. Our proposed solution is instead able to redirect a data path of N parallel words onto M lanes, with M variable in real time.

Fig. 1 Switching block IP black box model.

Copyright © 2019 The Institute of Electronics, Information and Communication Engineers

Manuscript received January 4, 2019.

<sup>†</sup>The authors are with the Department of Information Engineering, University of Pisa, Italy.

a) E-mail: pietro.nannipieri@ing.unipi.it

b) E-mail: gianmarco.dinelli@ing.unipi.it

c) E-mail: luca.fanucci@unipi.it

IEICE TRANS. FUNDAMENTALS, VOL.E102-A, NO.5 MAY 2019

**Table 1** SWIP interfaces.

| Name         | Bit   | I/O | Description                                                                                  |
|--------------|-------|-----|----------------------------------------------------------------------------------------------|
| Data_in      | N*W_w | I   | N words read out of the host asynchronous FIFO                                               |
| FIFO_empty   | 1     | I   | Flags host FIFO emptiness                                                                    |
| active_lanes | M     | I   | Single lane status information. $i^{th}$ bit = 1 means that the corresponding lane is active |
| LL_read      | 1     | I   | Signals that the lower layer is ready to receive a row in the subsequent clock cycle         |
| clk          | 1     | I   | Clock signal                                                                                 |
| reset        | 1     | I   | Reset signal (synchronous or asynchronous)                                                   |
| enable       | 1     | I   | Global enable                                                                                |
| FIFO_read    | 1     | O   | Requests a row of N words to be read out of the host FIFO                                    |
| Data_out     | M*W_w | O   | Words to be sent to each active lane, keeping the correct order                              |
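The mapping of M ordered words onto the lanes flagged by *active\_lanes* can be illustrated with a short Python sketch. It is only a behavioural illustration, not the hardware implementation: the function name `map_to_lanes`, the use of `None` for idle lanes, and the convention that the lowest-numbered active lane carries the earliest word are assumptions made for this example.

```python
def map_to_lanes(words, active_lanes):
    """Place `words`, in order, on the active lanes.

    `active_lanes` is a list of booleans, one per physical lane
    (True = lane active). Inactive lanes are returned as None.
    Assumption: the earliest word goes to the lowest-numbered active lane.
    """
    n_active = sum(1 for lane in active_lanes if lane)
    if len(words) != n_active:
        raise ValueError("exactly one word per active lane is required")
    remaining = iter(words)
    return [next(remaining) if lane else None for lane in active_lanes]


# Three-lane link (N = 3) with lane 1 inactive, so M = 2.
row = map_to_lanes(["W0", "W1"], [True, False, True])
```

Here `row` is `["W0", None, "W1"]`: the word order is preserved and the inactive lane carries no data.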

## 2. Design

The SWIP reshapes the data packet in case of one or more lane failures, allowing coherent data transmission. It selects M consecutive words, where M is the number of active lanes, and assigns each one to the corresponding active lane. The proposed architecture has been developed, integrated and tested with the SpaceFibre multi-lane protocol [6]. It consists of three main components, as represented in Fig. 2: the Flux-control finite state machine, the Reshaping block and two registers, Reg\_0 and Reg\_1. The Flux-control FSM (from now on FC FSM) manages the data stream to ensure correct communication. It generates the signal *index*, responsible for selecting consecutive words stored in Reg\_0 and Reg\_1, and the signal *FIFO\_read*, which shifts a complete data row from *data\_in* to Reg\_0 and from Reg\_0 to Reg\_1. The latter signal also acts as read enable for the host FIFO. The presented interface is meant to generically represent the structure of a technology-independent FIFO. The Reshaping block assigns the words to the active lanes. The two registers are used to synchronize data transmission. In



the following, the mechanism behind the block scheme is described, both for the case in which the number of active lanes is M = N (all lanes are active, no realignment is necessary) and for the case M < N (one or more lanes are not active, realignment is necessary).

When all lanes are active (M = N), a complete data row is shifted from *data\_in* to Reg\_0 and from Reg\_0 to Reg\_1. The FC FSM signals *FIFO\_read* and *index*, shown in Fig. 2, do not affect system operations.

In case of one or more lane failures, the total link data rate decreases: the SWIP shall read M consecutive words out of the host FIFO, with M < N, as it is no longer possible to transmit a complete data row per clock cycle. The signal *active\_lanes* is assumed to change synchronously with the signal *clk*. The structure with two shift registers, Reg\_0 and Reg\_1, has been chosen in order to provide the requested number of words each clock cycle. For clarity, let us consider a three-lane link composed of two active lanes and one inactive lane, shown in Fig. 3. Each clock cycle, the FC FSM shall compute both *FIFO\_read* and the value of *index* so as to select two consecutive words per clock cycle. *FIFO\_read* is set to 1 when there are no words left to be read in Reg\_1. In the first clock cycle, a), *index* is set to 0 and the system reads out W0/1. As W2 has not been read, no data is read out of the host FIFO and the shift between Reg\_0 and Reg\_1 is inhibited. After one clock cycle, in b), the *index* value is 2, thus W2/3 are read across Reg\_0 and Reg\_1. No more data have to be read from Reg\_1, thus in c) a data row is read out of the host FIFO and Reg\_0 is shifted into Reg\_1. *index* is set to 1 in order to read out W4/5. Finally, after one more clock cycle, in d), the situation is back to the initial one, with *index* set to 0, and W6/7 are read. As a last step, the Reshaping block shall assign the words selected by the FC FSM to the active lanes, using the signal *active\_lanes*, which indicates which lanes are active and which are not. If one or more inactive lanes become active again, the SWIP will not be able to provide a valid word to each reconnected lane for one clock cycle. However, this is not critical for most multi-lane protocols (e.g. SpaceFibre). Usually, a resynchronization mechanism is provided in case of lane reconnection, with the data stream being interrupted for one or more clock



Fig. 3 Example of SWIP operations.

cycles. Thus, the signal *LL\_read* will be set to '0', freezing the SWIP (*FIFO\_read* set to '0' and *index* unchanged).
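The index sequence of the Fig. 3 example can be reproduced with a small behavioural model. The Python sketch below is not the authors' RTL and is not cycle-accurate with respect to *FIFO\_read*: it assumes that the two registers form a window of 2N words (Reg\_1 first, then Reg\_0), that M stays constant, and that the host FIFO always has data. The names `swip_trace` and `advance` are illustrative only; `advance` marks the cycles whose selection exhausts Reg\_1, so that a row shift takes effect before the next cycle.

```python
from itertools import count


def swip_trace(n_lanes, m_active, n_cycles):
    """Return one (index, words_sent, advance) tuple per clock cycle."""
    if not 0 < m_active <= n_lanes:
        raise ValueError("m_active must satisfy 0 < M <= N")
    words = count()  # word numbers 0, 1, 2, ... stand for W0, W1, W2, ...

    def next_row():
        # One row of N words read out of the (never empty) host FIFO.
        return [next(words) for _ in range(n_lanes)]

    reg_1 = next_row()  # older row
    reg_0 = next_row()  # newer row
    index = 0
    trace = []
    for _ in range(n_cycles):
        window = reg_1 + reg_0
        sent = window[index:index + m_active]
        # Reg_1 is exhausted once the selection reaches its last word.
        advance = index + m_active >= n_lanes
        trace.append((index, sent, advance))
        if advance:
            reg_1, reg_0 = reg_0, next_row()
            index = index + m_active - n_lanes
        else:
            index += m_active
    return trace


trace = swip_trace(n_lanes=3, m_active=2, n_cycles=4)
```

For N = 3 and M = 2 the model yields the index values 0, 2, 1, 0 and the word pairs W0/1, W2/3, W4/5, W6/7, matching steps a) to d) described above. With M = N the index stays at 0 and a full row is sent every cycle.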

A multi-lane protocol may transmit both control words (e.g. the beginning and the end of a frame, error conditions, etc.) and data words. If we consider the SpaceFibre protocol, control words shall be replicated on all the active lanes, while data words shall not. Moreover, if the number of words that compose a data frame is not a multiple of the number of active lanes, a special control word (named PAD) shall be inserted to form a complete row. The FC FSM architecture has been designed considering these requirements, which are shared partially or completely with other multi-lane communication protocols.
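The two row-forming rules just described can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the FC FSM itself: the helper names `frame_rows` and `control_row`, the string `"PAD"` used as a placeholder symbol, and the choice of appending the PAD words at the end of the frame are all hypothetical.

```python
PAD = "PAD"  # placeholder for the PAD control word (assumed representation)


def frame_rows(data_words, m_active):
    """Split a data frame into rows of M words, padding the last row."""
    if m_active <= 0:
        raise ValueError("at least one active lane is required")
    # Number of PAD words needed to reach a multiple of M (0 if already one).
    n_pad = -len(data_words) % m_active
    padded = list(data_words) + [PAD] * n_pad
    return [padded[i:i + m_active] for i in range(0, len(padded), m_active)]


def control_row(control_word, m_active):
    """A control word is replicated on every active lane."""
    return [control_word] * m_active


rows = frame_rows(["D0", "D1", "D2", "D3", "D4"], m_active=2)
```

With five data words and two active lanes, `rows` is `[["D0", "D1"], ["D2", "D3"], ["D4", "PAD"]]`: a single PAD completes the last row.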

## 3. Results

The proposed architecture has been synthesized on the Xilinx ZC706 Evaluation Board (xc7z045ffg900-2) with different numbers of lanes, to evaluate its performance in terms of resource utilization. Syntheses with N = 2, 3, 4 and 8 lanes have been performed, as shown in Table 2. Resource utilization is analysed considering the look-up tables (LUT) and registers (REG) necessary to synthesize the SWIP. The frequency has been set at 62.5 MHz, which corresponds to a single lane data rate of 2.5 Gbps. It has been selected as it is compatible with most of the instruments used for testing. The SWIP has been successfully integrated in the data link layer of a two-lane SpaceFibre codec to evaluate its behavior in an operative context. The system has been implemented and tested on the Xilinx board, as documented in [11]. The SWIP functionality has been proven both with several simulations and with intensive hardware testing (e.g. different combinations of lane cable disconnection and

**Table 2** SWIP resource utilisation on Zynq-7000.

| Number of lanes | LUT  | util% LUT | Reg  | util% Reg |
|-----------------|------|-----------|------|-----------|
| 2               | 1203 | 0.55%     | 331  | 0.08%     |
| 3               | 1977 | 0.90%     | 478  | 0.11%     |
| 4               | 3535 | 1.63%     | 625  | 0.14%     |
| 8               | 8622 | 3.94%     | 1218 | 0.28%     |

connection).
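The correspondence between the 62.5 MHz clock and the 2.5 Gbps lane rate can be checked with a short calculation. It assumes, although this is not stated explicitly above, a word width of W\_w = 32 bits and 8b/10b line encoding as used by SpaceFibre, so that each word occupies 40 bits on the line:

$$
R_{\text{lane}} = f_{\text{clk}} \cdot W_w \cdot \frac{10}{8} = 62.5\,\text{MHz} \times 32\,\text{bit} \times 1.25 = 2.5\,\text{Gbps}
$$

Under the same assumptions, one row of M words is transferred per clock cycle, so the aggregate line rate scales as M times this value.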

## 4. Conclusions

In this article the architecture of a flexible hardware word re-ordering block has been presented. The block is to be inserted in a codec which implements a multi-lane communication protocol. It is able to change the width of the link in real time depending on the number of active lanes, providing great flexibility to the user. The maximum width of the host FIFO is generic, and the number of lanes onto which the data flow is redirected can vary in real time between 0 and the maximum width of the link. This solution overcomes the flexibility limitations of similar circuits available in the literature and provides potential designers with a very flexible architecture.

## References

- [1] PCI Express Base Specification, rev. 3.0, PCI-SIG, 2010.
- [2] IEEE Standard for Ethernet 802.3, 2015.
- [3] RapidIO.org, RapidIO Interconnect Specification Version 3.1. Available: http://www.rapidio.org/wp-content/uploads/2014/10/RapidIO-3.1-Specification.pdf
- [4] S. Baymani, K. Alexopoulos, and S. Valat, "RapidIO as a multipurpose interconnect," J. Phys.: Conf. Ser., vol.898, no.8, Nov. 2017.
- [5] T. Scheckel, "Serial RapidIO: Benefiting system interconnects," IEEE International SOC Conference, 2005.
- [6] S. Parkes, C. McClements, D. McLaren, A.F. Florit, and A.G. Villafranca, "SpaceFibre: A multi-Gigabit/s interconnect for spacecraft onboard data handling," IEEE Aerospace Conference Proceedings, 2015.
- [7] A.F. Florit, A.G. Villafranca, and S. Parkes, "SpaceFibre multi-lane," 7th International SpaceWire Conference, 2016.
- [8] C.E. Cummings, "Simulation and synthesis techniques for asynchronous FIFO design," SNUG 2002 (Synopsys Users Group Conference, San Jose, CA), 2002.
- [9] J.B. Mastronarde, "Dual input lane reordering data buffer," U.S. Patent 6,510,472, issued date Jan. 21, 2003.
- [10] M. Chiabrera, "Apparatus and method for transmitting and recovering multi-lane encoded data streams using a reduced number of lanes," U.S. Patent 8,259,760, issued date Sept. 4, 2012.
- [11] P. Nannipieri, G. Dinelli, D. Davall, and L. Fanucci, "A SpaceFibre multi lane codec system on a chip: Enabling technology for low cost satellite EGSE," PRIME 2018 - 14th Conference on PhD Research in Microelectronics and Electronics, 2018.