An integrated solution for QoS provision and congestion management in high-performance interconnection networks using deterministic source-based routing

The Journal of Supercomputing

Abstract

A key element in any system built from several interconnected computing and/or storage nodes is the interconnection network. Currently, one of the main concerns of designers of high-speed interconnection networks is how to improve network performance while using as few network resources as possible. To that end, this paper describes an efficient switch architecture suitable for any interconnect technology that implements deterministic source-based routing. The architecture uses the same network resources to provide two features that improve network performance: congestion management and QoS support. We also present results comparing the effectiveness of this architecture with that of other proposals typically used to provide these features in this context. The results were obtained for synthetic traffic and for traces from parallel benchmarks and video frames. From them, we conclude that, in every traffic scenario evaluated, our proposal is as effective as the previous ones while requiring fewer resources, which makes it much more cost-effective.

Notes

  1. For simplicity, we assume 8-port switches and four SLs. Architectures with a different number of ports or SLs can be easily deduced.

  2. We use the term flit for the generic flow-control unit; note that each interconnect technology may define its own flit size.

  3. These values are adequate to monitor instantaneous traffic behavior.

Acknowledgements

We would like to thank and acknowledge the Intelligent Systems Group (ISG) of the University of the Basque Country (UPV/EHU) for their contribution to this paper.

This work has been jointly supported by the Spanish MINECO under the project TIN2012-38341-C04-04, and by the JCCM under the project POII10-0289-3724.

Author information

Corresponding author

Correspondence to Juan A. Villar.

About this article

Cite this article

Villar, J.A., García, P.J., Alfaro, F.J. et al. An integrated solution for QoS provision and congestion management in high-performance interconnection networks using deterministic source-based routing. J Supercomput 66, 284–304 (2013). https://doi.org/10.1007/s11227-013-0904-0

  • DOI: https://doi.org/10.1007/s11227-013-0904-0
