Abstract
The multi-layer perceptron (MLP) is a class of artificial neural networks widely used for regression, classification, and prediction. MLP training can be accelerated by using more cores for parallel computing on many-core systems. However, as the number of cores integrated on a chip increases, the communication bottleneck of MLP training on an electrical network-on-chip (ENoC) becomes severe and degrades training performance. Replacing the ENoC with an optical network-on-chip (ONoC) can break this communication bottleneck. To guide the development of ONoCs for MLP training, it is necessary to model and compare the MLP training performance of ONoC and ENoC in advance. This paper first analyzes and compares the differences between ONoC and ENoC. We then formulate performance and energy models of MLP training on ONoC and ENoC by analyzing the communication time, computation time, static energy, and dynamic energy consumption, respectively. Furthermore, we conduct extensive simulations with our simulation infrastructure to compare their MLP training performance and energy consumption. The experimental results show that, compared with ENoC, ONoC reduces MLP training time by 65.16% and 52.51% on average across different numbers of cores and batch sizes, respectively. The results also show that ONoC achieves overall energy reductions of 54.86% and 43.13% on average across different numbers of cores and batch sizes compared with ENoC. However, with a small number of cores (e.g., fewer than 50), ENoC consumes less energy than ONoC for MLP training. These experiments confirm that ONoC is generally a good replacement for ENoC in terms of performance and energy consumption when MLP training uses a large number of cores.
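The abstract describes decomposing MLP training time into computation and communication terms. The following toy sketch illustrates that decomposition only; the function names, latency curves, and every constant below are hypothetical assumptions for illustration, not the paper's actual model or measured numbers:

```python
# Toy decomposition: per-iteration time T = T_compute + T_comm.
# All constants are hypothetical; this is not the paper's model.

def training_time(cores, batch, flops_per_sample, core_flops, comm_latency):
    """Compute time shrinks as cores are added; communication does not."""
    t_compute = batch * flops_per_sample / (cores * core_flops)
    t_comm = comm_latency(cores)          # network-dependent term
    return t_compute + t_comm

# Hypothetical latency curves: an electrical mesh's latency grows with
# network diameter (~sqrt(cores)); an optical network is modeled here
# as near-constant. Both curves are illustrative assumptions.
enoc_latency = lambda n: 1e-6 * (n ** 0.5)
onoc_latency = lambda n: 2e-6

t_enoc = training_time(256, 64, 1e6, 1e9, enoc_latency)
t_onoc = training_time(256, 64, 1e6, 1e9, onoc_latency)
reduction_pct = (t_enoc - t_onoc) / t_enoc * 100  # toy numbers only
```

Under these made-up curves, the communication term dominates the gap at high core counts, which is the qualitative effect the paper quantifies with its simulations.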
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
References
Dai F, Chen Y, Huang Z, Zhang H (2021) Performance comparison of multi-layer perceptron training on electrical and optical network-on-chips. In: International Conference on Parallel and Distributed Computing: Applications and Technologies, Springer, pp 129–141
Nabavinejad SM, Baharloo M, Chen K-C, Palesi M, Kogel T (2020) An overview of efficient interconnection networks for deep neural network accelerators. IEEE J Emerg Sel Topics Circuits Syst 10(3):268–282
Liu F, Zhang H, Chen Y, Huang Z, Gu H (2017) Wavelength-reused hierarchical optical network on chip architecture for manycore processors. IEEE Trans Sustain Comput 4(2):231–244
Yang W, Chen Y, Huang Z, Zhang H (2017) Rwadmm: routing and wavelength assignment for distribution-based multiple multicasts in onoc. In: 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), IEEE, pp 550–557
Dai F, Chen Y, Zhang H, Huang Z (2021) Accelerating fully connected neural network on optical network-on-chip (onoc). arXiv preprint arXiv:2109.14878
Zhao Y, Ge F, Cui C, Zhou F, Wu N (2020) A mapping method for convolutional neural networks on network-on-chip. In: 2020 IEEE 20th International Conference on Communication Technology (ICCT), IEEE, pp 916–920
Khan ZA, Abbasi U, Kim SW (2022) An efficient algorithm for mapping deep learning applications on the noc architecture. Appl Sci 12(6):3136
Mirmahaleh SYH, Rahmani AM (2019) Dnn pruning and mapping on noc-based communication infrastructure. Microelectron J 94:104655
Chen Y-H, Krishna T, Emer JS, Sze V (2016) Eyeriss: an energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE J Solid-State Circuits 52(1):127–138
Lu W, Yan G, Li J, Gong S, Han Y, Li X (2017) Flexflow: a flexible dataflow accelerator architecture for convolutional neural networks. In: 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), IEEE, pp 553–564
Kwon H, Samajdar A, Krishna T (2018) Maeri: enabling flexible dataflow mapping over dnn accelerators via reconfigurable interconnects. ACM SIGPLAN Not 53(2):461–475
Yasoubi A, Hojabr R, Takshi H, Modarressi M, Daneshtalab M (2015) Cupan–high throughput on-chip interconnection for neural networks. In: International Conference on Neural Information Processing, Springer, pp 559–566
Liu X, Wen W, Qian X, Li H, Chen Y (2018) Neu-noc: a high-efficient interconnection network for accelerated neuromorphic systems. In: 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), IEEE, pp 141–146
Firuzan A, Modarressi M, Daneshtalab M, Reshadi M (2018) Reconfigurable network-on-chip for 3d neural network accelerators. In: 2018 Twelfth IEEE/ACM International Symposium on Networks-on-Chip (NOCS), IEEE, pp 1–8
Dong Y, Kumai K, Lin Z, Li Y, Watanabe T (2009) High dependable implementation of neural networks with networks on chip architecture and a backtracking routing algorithm. In: 2009 Asia Pacific Conference on Postgraduate Research in Microelectronics & Electronics (PrimeAsia), IEEE, pp 404–407
Akopyan F, Sawada J, Cassidy A, Alvarez-Icaza R, Arthur J, Merolla P, Imam N, Nakamura Y, Datta P, Nam G-J et al (2015) Truenorth: design and tool flow of a 65 mw 1 million neuron programmable neurosynaptic chip. IEEE Trans Comput Aided Des Integr Circuits Syst 34(10):1537–1557
Kim J-Y, Park J, Lee S, Kim M, Oh J, Yoo H-J (2010) A 118.4 gb/s multi-casting network-on-chip with hierarchical star-ring combined topology for real-time object recognition. IEEE J Solid-State Circuits 45(7):1399–1409
Pan Y, Kumar P, Kim J, Memik G, Zhang Y, Choudhary A (2009) Firefly: illuminating future network-on-chip with nanophotonics. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, pp 429–440
Pan Y, Kim J, Memik G (2010) Flexishare: channel sharing for an energy-efficient nanophotonic crossbar. In: IEEE International Symposium on High Performance Computer Architecture (HPCA), pp 1–12
Vantrease D, Schreiber R, Monchiero M, McLaren M, Jouppi N, Fiorentino M, Davis A, Binkert N, Beausoleil R, Ahn J (2008) Corona: system implications of emerging nanophotonic technology. In: ACM/IEEE Proc. ISCA, pp 153–164
Kurian G, Miller J, Psota J, Eastep J, Liu J, Michel J, Kimerling L, Agarwal A (2010) ATAC: a 1000-core cache-coherent processor with on-chip optical network. In: ACM Intl. Conf. Parallel Architectures and Compilation Techniques (PACT), pp 153–164
Bashir J, Peter E, Sarangi SR (2019) Bigbus: a scalable optical interconnect. ACM J Emerg Technol Comput Syst (JETC) 15(1):1–24
Bashir J, Sarangi SR (2017) Nuplet: a photonic based multi-chip nuca architecture. In: 2017 IEEE International Conference on Computer Design (ICCD), IEEE, pp 617–624
Ziabari AK, Abellán JL, Ubal R, Chen C, Joshi A, Kaeli D (2015) Leveraging silicon-photonic noc for designing scalable gpus. In: Proceedings of the 29th ACM on International Conference on Supercomputing, pp 273–282
Bashir J, Sarangi SR (2020) Gpuopt: power-efficient photonic network-on-chip for a scalable gpu. ACM J Emerg Technol Comput Syst (JETC) 17(1):1–26
Yahya MR, Wu N, Ali ZA, Khizar Y (2021) Optical versus electrical: performance evaluation of network on-chip topologies for uwasn manycore processors. Wireless Pers Commun 116(2):963–991
Okada R, Power and performance comparison of electronic 2d-noc and opto-electronic 2d-noc
Touza R, Martínez J, Álvarez M, Roca J (2022) Obtaining anti-missile decoy launch solution from a ship using machine learning techniques. Int J Interact Multimed Artif Intell 7(4)
Bashir J, Goodchild C, Sarangi SR (2020) Seconet: a security framework for a photonic network-on-chip. In: 2020 14th IEEE/ACM International Symposium on Networks-on-Chip (NOCS), IEEE, pp 1–8
Liu F, Zhang H, Chen Y, Huang Z, Gu H (2016) Dynamic ring-based multicast with wavelength reuse for optical network on chips. In: 2016 IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSOC), pp 153–160
Bashir J, Peter E, Sarangi SR (2019) A survey of on-chip optical interconnects. ACM Comput Surv (CSUR) 51(6):1–34
Peter E, Sarangi SR (2015) Optimal power efficient photonic swmr buses. In: 2015 Workshop on Exploiting Silicon Photonics for Energy-Efficient High Performance Computing, pp 25–32
Gibbons PB (1989) A more practical pram model. In: Proceedings of the First Annual ACM Symposium on Parallel Algorithms and Architectures, pp 158–168
Valiant LG (1990) A bridging model for parallel computation. Commun ACM 33(8):103–111
Culler D, Karp R, Patterson D, Sahay A, Schauser KE, Santos E, Subramonian R, von Eicken T (1993) Logp: towards a realistic model of parallel computation. In: Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp 1–12
Bilardi G, Herley KT, Pietracaprina A, Pucci G, Spirakis P (1996) Bsp vs logp. In: Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, pp 25–32
Kiasari AE, Rahmati D, Sarbazi-Azad H, Hessabi S (2008) A markovian performance model for networks-on-chip. In: 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008), pp 157–164
Tikir MM, Carrington L, Strohmaier E, Snavely A (2007) A genetic algorithms approach to modeling the performance of memory-bound computations. In: SC’07: Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, IEEE, pp 1–12
Zhuang X, Liberatore V (2005) A recursion-based broadcast paradigm in wormhole routed networks. IEEE Trans Parallel Distrib Syst 16(11):1034–1052
Grani P, Bartolini S (2014) Design options for optical ring interconnect in future client devices. ACM J Emerg Technol Comput Syst (JETC) 10(4):1–25
Lin Z, Memisevic R, Konda K (2015) How far can we go without convolution: improving fully-connected networks. arXiv preprint arXiv:1511.02580
Cireşan DC, Meier U, Gambardella LM, Schmidhuber J (2010) Deep, big, simple neural nets for handwritten digit recognition. Neural Comput 22(12):3207–3220
Kadam SS, Adamuthe AC, Patil AB (2020) Cnn model for image classification on mnist and fashion-mnist dataset. J Sci Res 64(2):374–384
Abouelnaga Y, Ali OS, Rady H, Moustafa M (2016) Cifar-10: Knn-based ensemble of classifiers. In: 2016 International Conference on Computational Science and Computational Intelligence (CSCI), IEEE, pp 1192–1195
Kågström B, Ling P, Van Loan C (1998) Gemm-based level 3 blas: high-performance model implementations and performance evaluation benchmark. ACM Trans Math Softw (TOMS) 24(3):268–302
Li RM, King CT, Das B (2016) Extending gem5-garnet for efficient and accurate trace-driven noc simulation. In: Proceedings of the 9th International Workshop on Network on Chip Architectures, pp 3–8
Sun C, Chen CHO, Kurian G, Wei L, Miller J, Agarwal A, Peh LS, Stojanovic V (2012) Dsent: a tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling. In: 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, IEEE, pp 201–210
van Laer A (2018) The effect of an optical network on-chip on the performance of chip multiprocessors. PhD thesis, UCL (University College London)
Zhang X, Louri A (2010) A multilayer nanophotonic interconnection network for on-chip many-core communications. In: ACM/IEEE Proc. DAC, pp 156–161
Vlasov Y, Green WMJ, Xia F (2008) High-throughput silicon nanophotonic wavelength-insensitive switch for on-chip optical networks. Nat Photonics 2(4):242–246
Deng L, Li G, Han S, Shi L, Xie Y (2020) Model compression and hardware acceleration for neural networks: A comprehensive survey. Proc IEEE 108(4):485–532
Acknowledgements
The authors thank the reviewers for the time and effort they devoted to reviewing the manuscript. We also acknowledge the use of New Zealand eScience Infrastructure (NeSI) high-performance computing facilities as part of this research (Project code: uoo03633).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
This study extends the work in [1], which was published at the 22nd International Conference on Parallel and Distributed Computing, Applications and Technologies in 2022.
Cite this article
Dai, F., Chen, Y., Huang, Z. et al. Comparing the performance of multi-layer perceptron training on electrical and optical network-on-chips. J Supercomput 79, 10725–10746 (2023). https://doi.org/10.1007/s11227-022-04945-y