A hybrid congestion control algorithm for broadcast-based architectures with multiple input queues

The Journal of Supercomputing

Abstract

This paper proposes a hybrid congestion control algorithm that prevents congestion in 2-D broadcast-based multiprocessor architectures with multiple input queues. The algorithm uses both the input queue parameters and the output channel parameters of a node to detect and prevent congestion. An intermediate node selection procedure and a bypass operation have also been developed as part of the algorithm. Its performance is tested with several synthetic traffic patterns on the 2-D simultaneous optical multiprocessor exchange bus (SOME-Bus) and compared with that of algorithms that use only input parameters or only output parameters. The results show that the proposed algorithm, which uses hybrid parameters, performs better than the others. On average, it decreases the average network response time by 33.63%, decreases the average input waiting time by 29.13%, and increases the average processor utilization by 7.57%.
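The abstract does not give the detection rule, so the following is only a minimal sketch of what a hybrid check could look like, not the authors' algorithm. It assumes a node is flagged as congested when either its fullest input queue or its output channel crosses a threshold, and that an intermediate node is chosen as the least-loaded non-congested candidate. The names `NodeState`, `is_congested`, `load_score` and `pick_intermediate`, the threshold values, and the either-or combination rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class NodeState:
    """Snapshot of one node. All fields are illustrative assumptions."""
    input_queue_lengths: Sequence[int]  # packets waiting in each input queue
    queue_capacity: int                 # capacity of each input queue (packets)
    output_utilization: float           # busy fraction of the output channel, 0..1


def load_score(node: NodeState) -> float:
    """Combine both parameter families into one number (assumed: take the worse one)."""
    fullest = max(node.input_queue_lengths, default=0) / node.queue_capacity
    return max(fullest, node.output_utilization)


def is_congested(node: NodeState,
                 queue_threshold: float = 0.75,
                 channel_threshold: float = 0.90) -> bool:
    """Hypothetical hybrid test: input-side OR output-side threshold exceeded."""
    fullest = max(node.input_queue_lengths, default=0) / node.queue_capacity
    return fullest >= queue_threshold or node.output_utilization >= channel_threshold


def pick_intermediate(candidates: Dict[str, NodeState]) -> Optional[str]:
    """Return the least-loaded non-congested candidate, or None if all are congested."""
    usable = {name: n for name, n in candidates.items() if not is_congested(n)}
    if not usable:
        return None
    # Sort key includes the name so that ties are broken deterministically.
    return min(usable, key=lambda name: (load_score(usable[name]), name))


if __name__ == "__main__":
    idle = NodeState([1, 0, 2], queue_capacity=8, output_utilization=0.20)
    busy = NodeState([7, 1, 0], queue_capacity=8, output_utilization=0.20)
    print(is_congested(idle), is_congested(busy))
    print(pick_intermediate({"n1": busy, "n2": idle}))
```

An input-only or output-only scheme, like the baselines the abstract compares against, would correspond to dropping one of the two conditions in `is_congested`.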

Acknowledgments

We thank Dr. Constantine Katsinis for letting us use the 2-D SOME-Bus architecture in this paper, OPNET Technologies, Inc. for letting us use the OPNET Modeler under the University Program, and the Çukurova University Scientific Research Projects Center for supporting this work (Project code: MMF2011D9).

Author information

Corresponding author

Correspondence to Çiğdem İnan Acı.

About this article

Cite this article

Acı, Ç.İ., Akay, M.F. A hybrid congestion control algorithm for broadcast-based architectures with multiple input queues. J Supercomput 71, 1907–1931 (2015). https://doi.org/10.1007/s11227-015-1384-1

  • DOI: https://doi.org/10.1007/s11227-015-1384-1
