A Resource-Efficient Communication Architecture for Chip Multiprocessors on FPGAs

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Significant advances in field-programmable gate arrays (FPGAs) have made it viable to explore innovative multiprocessor solutions on a single FPGA chip. For multiprocessors, an efficient communication network that matches the needs of the target application is always critical to overall performance. Wormhole packet-switching network-on-chip (NoC) solutions are replacing conventional shared buses to address the scalability and complexity challenges that come with the increasing number of processing elements (PEs). However, the quest for high-performance networks has led to very complex and resource-expensive NoC designs, leaving little room for the real computing force, i.e., the PEs. Moreover, many techniques increase the resource usage of routers while offering very small performance gains, or none at all, when network traffic is light. We argue that computation is still the primary task of multiprocessors and that sufficient resources should be reserved for PEs. This paper presents our novel design and implementation of a resource-efficient communication network for multiprocessors on FPGAs. We reduce both the number of routers required for a given number of PEs, by introducing a new PE-router topology, and the resource requirement of each router. Our communication network relies on the NEWS channels to transfer packets in a pipelined fashion, following the path determined by the routing network. Implementation results on various Xilinx FPGAs show good performance in the typical range of network load for multiprocessor applications.

Author information

Corresponding author

Correspondence to Xiaofang (Maggie) Wang.

About this article

Cite this article

Wang, X., Thota, S. A Resource-Efficient Communication Architecture for Chip Multiprocessors on FPGAs. J. Comput. Sci. Technol. 26, 434–447 (2011). https://doi.org/10.1007/s11390-011-1145-4

  • DOI: https://doi.org/10.1007/s11390-011-1145-4
