Abstract
Convolutional neural networks (CNNs) are widely used in deep learning, but their hardware cost is substantial. Traditional Central Processing Units (CPUs) and Graphics Processing Units (GPUs) are inefficient and expensive for neural networks, so an efficient hardware design is required. The proposed design, based on a Digital Signal Processor (DSP), offers high operating speed and strong computational ability for both training and inference of CNNs. In this paper, the hardware architecture of the convolution and softmax functions is specifically optimized. The Winograd algorithm reduces the number of multiplications required for convolution, and thus the hardware complexity, since a multiplier is far more complex to implement in hardware than an adder. The softmax function is likewise simplified by replacing its divider with a subtractor and a logarithmic function, which cost fewer resources. The proposed hardware architecture dramatically reduces complexity and hardware resource consumption.
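The abstract's claim about Winograd convolution can be illustrated with the standard F(2,3) minimal filtering case: two outputs of a 3-tap convolution computed with 4 multiplications instead of the 6 a direct computation needs. This is a generic sketch of the algorithm in Python, not the paper's actual hardware mapping; the function names are hypothetical.

```python
# Winograd minimal filtering F(2,3): produces 2 outputs of a 3-tap
# convolution with 4 multiplications instead of 6. In hardware, fewer
# multipliers directly translates to lower area and complexity.

def winograd_f23(d, g):
    """d: 4 input samples, g: 3 filter taps -> 2 output samples."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (precomputable once per filter, so its cost
    # is amortized across all input tiles).
    G0 = g0
    G1 = (g0 + g1 + g2) / 2.0
    G2 = (g0 - g1 + g2) / 2.0
    G3 = g2
    # The only 4 multiplications.
    m1 = (d0 - d2) * G0
    m2 = (d1 + d2) * G1
    m3 = (d2 - d1) * G2
    m4 = (d1 - d3) * G3
    # Output transform: additions/subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]

def direct_conv(d, g):
    """Direct sliding-window computation for comparison: 6 multiplications."""
    return [sum(d[i + j] * g[j] for j in range(3)) for i in range(2)]
```

Both routines produce identical results, e.g. `winograd_f23([1, 2, 3, 4], [1, 1, 1])` matches `direct_conv([1, 2, 3, 4], [1, 1, 1])`; the saving grows for larger tiles such as F(2x2, 3x3) in 2D.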
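The softmax simplification mentioned in the abstract can be understood as moving the normalization into the log domain, where the division becomes a subtraction: softmax(x_i) = exp(x_i - ln(sum_j exp(x_j))). A minimal numerical sketch of this identity, assuming nothing about the paper's fixed-point implementation details:

```python
import math

def softmax_div(x):
    """Conventional softmax: exponentials followed by a divider."""
    e = [math.exp(v) for v in x]
    s = sum(e)
    return [v / s for v in e]

def softmax_log_sub(x):
    """Division-free variant: exp(x_i - ln(sum_j exp(x_j))).
    The hardware divider is replaced by a logarithm unit and a
    subtractor, which cost fewer resources than a full divider."""
    log_sum = math.log(sum(math.exp(v) for v in x))
    return [math.exp(v - log_sum) for v in x]
```

The two functions are mathematically equivalent; in hardware, the exp and log units can additionally be approximated with lookup tables or shift-add schemes, which is where the resource saving over a divider comes from.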
Acknowledgement
The authors would like to thank the editors and the reviewers for providing comments and suggestions for this paper. This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61831018, 61901199, and 61631017, and Guangdong Province Key Research and Development Program Major Science and Technology Projects under Grant 2018B010115002.
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Cite this paper
Jiang, Z., Zhang, Z., Ren, H., Wu, J. (2021). Efficient Architecture for Convolution and Softmax Function in Deep Learning Accelerator. In: Gao, H., Fan, P., Wun, J., Xiaoping, X., Yu, J., Wang, Y. (eds) Communications and Networking. ChinaCom 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 352. Springer, Cham. https://doi.org/10.1007/978-3-030-67720-6_43
Print ISBN: 978-3-030-67719-0
Online ISBN: 978-3-030-67720-6