Comparing the performance of multi-layer perceptron training on electrical and optical network-on-chips

Abstract

The multi-layer perceptron (MLP) is a class of artificial neural networks widely used for regression, classification, and prediction. MLP training can be accelerated by using more cores for parallel computation on many-core systems. However, as the number of cores integrated on a chip grows, communication over an electrical network-on-chip (ENoC) becomes a severe bottleneck that degrades MLP training performance. Replacing the ENoC with an optical network-on-chip (ONoC) can remove this communication bottleneck. To guide the development of ONoCs for MLP training, it is necessary to model and compare the MLP training performance of ONoC and ENoC in advance. This paper first analyzes and compares the differences between ONoC and ENoC. We then formulate performance and energy models of MLP training on ONoC and ENoC by analyzing the communication time, computation time, static energy, and dynamic energy consumption. Furthermore, we conduct extensive simulations with our simulation infrastructure to compare their MLP training performance and energy consumption. The experimental results show that, compared with ENoC, ONoC reduces MLP training time by 65.16% and 52.51% on average across different numbers of cores and batch sizes, respectively. The results also show that ONoC achieves overall energy reductions of 54.86% and 43.13% on average across different numbers of cores and batch sizes, respectively. However, with a small number of cores (e.g., fewer than 50) in MLP training, ENoC consumes less energy than ONoC. These experiments confirm that, in terms of both performance and energy consumption, ONoC is generally a good replacement for ENoC when a large number of cores is used for MLP training.
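
To make the cost decomposition described above concrete, the following minimal Python sketch splits per-batch training time into computation plus communication and energy into static plus dynamic (traffic-dependent) parts. It is not the model derived in this paper: every function name, parameter, and number below (core throughput, link bandwidth, hop latency, static power, energy per byte) is a hypothetical placeholder used only to illustrate the structure of such a model.

    # Illustrative cost-model sketch; all names and values are hypothetical placeholders.
    def mlp_batch_time(flops, core_flops_per_s, bytes_moved, link_bandwidth, hops, per_hop_latency):
        """Per-batch time = computation time + communication time."""
        t_comp = flops / core_flops_per_s                                # arithmetic on one core
        t_comm = bytes_moved / link_bandwidth + hops * per_hop_latency   # transfer time + network latency
        return t_comp + t_comm

    def mlp_batch_energy(t_total, static_power, bytes_moved, energy_per_byte):
        """Per-batch energy = static energy + dynamic (traffic-dependent) energy."""
        return static_power * t_total + bytes_moved * energy_per_byte

    # ENoC-style assumption: latency grows with hop count, moderate bandwidth.
    # ONoC-style assumption: higher bandwidth, a near distance-independent optical path,
    # but higher static power (e.g., laser power consumed regardless of traffic).
    t_enoc = mlp_batch_time(2e9, 1e11, 5e7, 1e10, hops=8, per_hop_latency=2e-9)
    t_onoc = mlp_batch_time(2e9, 1e11, 5e7, 4e10, hops=1, per_hop_latency=5e-9)
    e_enoc = mlp_batch_energy(t_enoc, static_power=5.0, bytes_moved=5e7, energy_per_byte=1e-9)
    e_onoc = mlp_batch_energy(t_onoc, static_power=8.0, bytes_moved=5e7, energy_per_byte=2e-10)
    print(f"ENoC: {t_enoc * 1e3:.2f} ms/batch, {e_enoc * 1e3:.1f} mJ/batch")
    print(f"ONoC: {t_onoc * 1e3:.2f} ms/batch, {e_onoc * 1e3:.1f} mJ/batch")

With these particular placeholder numbers the ONoC case is faster but consumes slightly more energy, which mirrors the observation above that the ONoC's higher static power can outweigh its dynamic-energy advantage when there are too few cores (or too little traffic) to amortize it.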

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

  1. Dai F, Chen Y, Huang Z, Zhang H (2021) Performance comparison of multi-layer perceptron training on electrical and optical network-on-chips. In: International Conference on Parallel and Distributed Computing: Applications and Technologies, Springer, pp 129–141

  2. Nabavinejad SM, Baharloo M, Chen K-C, Palesi M, Kogel T (2020) An overview of efficient interconnection networks for deep neural network accelerators. IEEE J Emerg Sel Topics Circuits Syst 10(3):268–282

  3. Liu F, Zhang H, Chen Y, Huang Z, Gu H (2017) Wavelength-reused hierarchical optical network on chip architecture for manycore processors. IEEE Trans Sustain Comput 4(2):231–244

  4. Yang W, Chen Y, Huang Z, Zhang H (2017) Rwadmm: routing and wavelength assignment for distribution-based multiple multicasts in onoc. In: 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), IEEE, pp 550–557

  5. Dai F, Chen Y, Zhang H, Huang Z (2021) Accelerating fully connected neural network on optical network-on-chip (onoc). arXiv preprint arXiv:2109.14878

  6. Zhao Y, Ge F, Cui C, Zhou F, Wu N (2020) A mapping method for convolutional neural networks on network-on-chip. In: 2020 IEEE 20th International Conference on Communication Technology (ICCT), IEEE, pp 916–920

  7. Khan ZA, Abbasi U, Kim SW (2022) An efficient algorithm for mapping deep learning applications on the noc architecture. Appl Sci 12(6):3136

  8. Mirmahaleh SYH, Rahmani AM (2019) Dnn pruning and mapping on noc-based communication infrastructure. Microelectron J 94:104655

  9. Chen Y-H, Krishna T, Emer JS, Sze V (2016) Eyeriss: an energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE J Solid-State Circuits 52(1):127–138

  10. Lu W, Yan G, Li J, Gong S, Han Y, Li X (2017) Flexflow: a flexible dataflow accelerator architecture for convolutional neural networks. In: 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), IEEE, pp 553–564

  11. Kwon H, Samajdar A, Krishna T (2018) Maeri: enabling flexible dataflow mapping over dnn accelerators via reconfigurable interconnects. ACM SIGPLAN Not 53(2):461–475

  12. Yasoubi A, Hojabr R, Takshi H, Modarressi M, Daneshtalab M (2015) Cupan: high throughput on-chip interconnection for neural networks. In: International Conference on Neural Information Processing, Springer, pp 559–566

  13. Liu X, Wen W, Qian X, Li H, Chen Y (2018) Neu-noc: a high-efficient interconnection network for accelerated neuromorphic systems. In: 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), IEEE, pp 141–146

  14. Firuzan A, Modarressi M, Daneshtalab M, Reshadi M (2018) Reconfigurable network-on-chip for 3d neural network accelerators. In: 2018 Twelfth IEEE/ACM International Symposium on Networks-on-Chip (NOCS), IEEE, pp 1–8

  15. Dong Y, Kumai K, Lin Z, Li Y, Watanabe T (2009) High dependable implementation of neural networks with networks on chip architecture and a backtracking routing algorithm. In: 2009 Asia Pacific Conference on Postgraduate Research in Microelectronics & Electronics (PrimeAsia), IEEE, pp 404–407

  16. Akopyan F, Sawada J, Cassidy A, Alvarez-Icaza R, Arthur J, Merolla P, Imam N, Nakamura Y, Datta P, Nam G-J et al (2015) Truenorth: design and tool flow of a 65 mW 1 million neuron programmable neurosynaptic chip. IEEE Trans Comput Aided Des Integr Circuits Syst 34(10):1537–1557

  17. Kim J-Y, Park J, Lee S, Kim M, Oh J, Yoo H-J (2010) A 118.4 Gb/s multi-casting network-on-chip with hierarchical star-ring combined topology for real-time object recognition. IEEE J Solid-State Circuits 45(7):1399–1409

  18. Pan Y, Kumar P, Kim J, Memik G, Zhang Y, Choudhary A (2009) Firefly: illuminating future network-on-chip with nanophotonics. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, pp 429–440

  19. Pan Y, Kim J, Memik G (2010) Flexishare: channel sharing for an energy-efficient nanophotonic crossbar. In: IEEE Intl. Symp. High Perf. Comput. Archit. (HPCA), pp 1–12

  20. Vantrease D, Schreiber R, Monchiero M, McLaren M, Jouppi N, Fiorentino M, Davis A, Binkert N, Beausoleil R, Ahn J (2008) Corona: system implications of emerging nanophotonic technology. In: ACM/IEEE Proc. ISCA, pp 153–164

  21. Kurian G, Miller J, Psota J, Eastep J, Liu J, Michel J, Kimerling L, Agarwal A (2010) ATAC: a 1000-core cache-coherent processor with on-chip optical network. In: ACM Intl. Conf. Parallel Architectures and Compilation Techniques (PACT), pp 153–164

  22. Bashir J, Peter E, Sarangi SR (2019) Bigbus: a scalable optical interconnect. ACM J Emerg Technol Comput Syst (JETC) 15(1):1–24

  23. Bashir J, Sarangi SR (2017) Nuplet: a photonic based multi-chip nuca architecture. In: 2017 IEEE International Conference on Computer Design (ICCD), IEEE, pp 617–624

  24. Ziabari AK, Abellán JL, Ubal R, Chen C, Joshi A, Kaeli D (2015) Leveraging silicon-photonic noc for designing scalable gpus. In: Proceedings of the 29th ACM on International Conference on Supercomputing, pp 273–282

  25. Bashir J, Sarangi SR (2020) Gpuopt: power-efficient photonic network-on-chip for a scalable gpu. ACM J Emerg Technol Comput Syst (JETC) 17(1):1–26

  26. Yahya MR, Wu N, Ali ZA, Khizar Y (2021) Optical versus electrical: performance evaluation of network on-chip topologies for uwasn manycore processors. Wireless Pers Commun 116(2):963–991

  27. Okada R, Power and performance comparison of electronic 2d-noc and opto-electronic 2d-noc

  28. Touza R, Martínez J, Álvarez M, Roca J (2022) Obtaining anti-missile decoy launch solution from a ship using machine learning techniques. Int J Interact Multimed Artif Intell 7(4)

  29. Bashir J, Goodchild C, Sarangi SR (2020) Seconet: a security framework for a photonic network-on-chip. In: 2020 14th IEEE/ACM International Symposium on Networks-on-Chip (NOCS), IEEE, pp 1–8

  30. Liu F, Zhang H, Chen Y, Huang Z, Gu H (2016) Dynamic ring-based multicast with wavelength reuse for optical network on chips. In: 2016 IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSOC), pp 153–160

  31. Bashir J, Peter E, Sarangi SR (2019) A survey of on-chip optical interconnects. ACM Comput Surv (CSUR) 51(6):1–34

  32. Peter E, Sarangi SR (2015) Optimal power efficient photonic swmr buses. In: 2015 Workshop on Exploiting Silicon Photonics for Energy-Efficient High Performance Computing, pp 25–32

  33. Gibbons PB (1989) A more practical pram model. In: Proceedings of the First Annual ACM Symposium on Parallel Algorithms and Architectures, pp 158–168

  34. Valiant LG (1990) A bridging model for parallel computation. Commun ACM 33(8):103–111

  35. Culler D, Karp R, Patterson D, Sahay A, Schauser KE, Santos E, Subramonian R, von Eicken T (1993) Logp: towards a realistic model of parallel computation. In: Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp 1–12

  36. Bilardi G, Herley KT, Pietracaprina A, Pucci G, Spirakis P (1996) Bsp vs logp. In: Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, pp 25–32

  37. Kiasari AE, Rahmati D, Sarbazi-Azad H, Hessabi S (2008) A Markovian performance model for networks-on-chip. In: 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008), pp 157–164

  38. Tikir MM, Carrington L, Strohmaier E, Snavely A (2007) A genetic algorithms approach to modeling the performance of memory-bound computations. In: SC'07: Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, IEEE, pp 1–12

  39. Zhuang X, Liberatore V (2005) A recursion-based broadcast paradigm in wormhole routed networks. IEEE Trans Parallel Distrib Syst 16(11):1034–1052

  40. Grani P, Bartolini S (2014) Design options for optical ring interconnect in future client devices. ACM J Emerg Technol Comput Syst (JETC) 10(4):1–25

  41. Lin Z, Memisevic R, Konda K (2015) How far can we go without convolution: improving fully-connected networks. arXiv preprint arXiv:1511.02580

  42. Cireşan DC, Meier U, Gambardella LM, Schmidhuber J (2010) Deep, big, simple neural nets for handwritten digit recognition. Neural Comput 22(12):3207–3220

  43. Kadam SS, Adamuthe AC, Patil AB (2020) Cnn model for image classification on mnist and fashion-mnist dataset. J Sci Res 64(2):374–384

  44. Abouelnaga Y, Ali OS, Rady H, Moustafa M (2016) Cifar-10: Knn-based ensemble of classifiers. In: 2016 International Conference on Computational Science and Computational Intelligence (CSCI), IEEE, pp 1192–1195

  45. Kågström B, Ling P, Van Loan C (1998) Gemm-based level 3 blas: high-performance model implementations and performance evaluation benchmark. ACM Trans Math Softw (TOMS) 24(3):268–302

  46. Li RM, King CT, Das B (2016) Extending gem5-garnet for efficient and accurate trace-driven noc simulation. In: Proceedings of the 9th International Workshop on Network on Chip Architectures, pp 3–8

  47. Sun C, Chen CHO, Kurian G, Wei L, Miller J, Agarwal A, Peh LS, Stojanovic V (2012) Dsent - a tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling. In: 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, IEEE, pp 201–210

  48. Van Laer A (2018) The effect of an optical network on-chip on the performance of chip multiprocessors. PhD thesis, UCL (University College London)

  49. Zhang X, Louri A (2010) A multilayer nanophotonic interconnection network for on-chip many-core communications. In: ACM/IEEE Proc. DAC, pp 156–161

  50. Vlasov Y, Green WMJ, Xia F (2008) High-throughput silicon nanophotonic wavelength-insensitive switch for on-chip optical networks. Nat Photonics 2(4):242–246

  51. Deng L, Li G, Han S, Shi L, Xie Y (2020) Model compression and hardware acceleration for neural networks: A comprehensive survey. Proc IEEE 108(4):485–532


Acknowledgements

The authors thank the reviewers for the time and effort they devoted to reviewing the manuscript. We also acknowledge the use of New Zealand eScience Infrastructure (NeSI) high-performance computing facilities as part of this research (Project code: uoo03633).

Author information

Corresponding author

Correspondence to Fei Dai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This study extends the work in [1], which was published at The 22nd International Conference on Parallel and Distributed Computing, Applications, and Technologies in 2022.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dai, F., Chen, Y., Huang, Z. et al. Comparing the performance of multi-layer perceptron training on electrical and optical network-on-chips. J Supercomput 79, 10725–10746 (2023). https://doi.org/10.1007/s11227-022-04945-y
