Abstract
Deep Neural Networks (DNNs) have demonstrated great success in many fields such as image recognition and text analysis. However, the ever-increasing sizes of both DNN models and training datasets make deep learning extremely computation- and memory-intensive. Recently, photonic computing has emerged as a promising technology for accelerating DNNs. While the design of photonic accelerators for DNN inference and the forward propagation of DNN training has been widely investigated, architectural acceleration for the equally important backpropagation phase of DNN training has not been well studied. In this paper, we propose a novel silicon photonic backpropagation accelerator for high-performance DNN training. Specifically, we design a general-purpose photonic gradient descent unit, named STADIA, that implements the multiplication, accumulation, and subtraction operations required for computing gradients using mature optical devices, including the Mach-Zehnder Interferometer (MZI) and the Microring Resonator (MRR), which significantly reduces training latency and improves the energy efficiency of backpropagation. To enable efficient parallel computing, we propose a STADIA-based backpropagation acceleration architecture and design a dataflow based on wavelength-division multiplexing (WDM). We analyze the precision of STADIA by quantifying the limitations imposed by optical losses and noise. Furthermore, we evaluate STADIA with different element sizes by analyzing the power, area, and time delay of photonic accelerators for DNN models such as AlexNet, VGG19, and ResNet. Simulation results show that the proposed STADIA architecture achieves improvements of 9.7× in time efficiency and 147.2× in energy efficiency over the most advanced optical-memristor-based backpropagation accelerator.
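The multiply-accumulate-subtract pattern of the stochastic gradient descent update that the abstract describes mapping onto optical devices can be sketched in plain Python. This is a conceptual sketch only, not the paper's implementation: the layer sizes, batch size, and learning rate below are illustrative assumptions.

```python
import numpy as np

# Conceptual sketch of one SGD step for a single fully connected layer,
# showing the three operations the abstract maps to photonic hardware:
# multiplication (error x activation outer products), accumulation
# (summing those products over the batch), and subtraction (weight update).

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))    # weights: 4 outputs, 3 inputs (illustrative)
x = rng.standard_normal((8, 3))    # batch of 8 input activations
err = rng.standard_normal((8, 4))  # upstream error signals (deltas) per sample
lr = 0.01                          # learning rate (illustrative)

# Multiplication + accumulation: the gradient is the sum over the batch
# of outer products err[b] (x) x[b], computed here as one matrix product.
grad = err.T @ x                   # shape (4, 3)

# Subtraction: gradient-descent weight update.
W_new = W - lr * grad
```

In a photonic realization, the matrix product would be carried out by MZI/MRR meshes and the per-wavelength partial products accumulated via WDM, but the arithmetic being performed is exactly this multiply-accumulate-subtract sequence.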
STADIA: Photonic Stochastic Gradient Descent for Neural Network Accelerators