An Efficient Dataflow Mapping Method for Convolutional Neural Networks

Abstract

Convolutional neural networks (CNNs) have been widely used in speech recognition, object detection, and image recognition. During inference, data access operations consume more energy than the computations themselves, so optimizing the dataflow from external storage to the on-chip processing units is an effective way to reduce power consumption. The state-of-the-art row stationary dataflow maximizes data reuse to reduce the number of data movements and thus the power consumption of the system, but it suffers from low processing-unit utilization and poor scalability. In this letter, we propose an enhanced row stationary (ERS) dataflow. By changing the data mapping, ERS maps each channel of the three-dimensional filter and of the input feature map (ifmap) onto a column of processing units. The processing array operates on multiple filters in parallel, which effectively improves the utilization of computing resources. Within each processing unit, one row of filter data and one row of ifmap data are processed at a time, and both rows are reused across multiple convolution operations. In addition, a configurable sliding window model is proposed to solve a case that existing dataflows cannot handle: a computing-array width smaller than the filter width. Simulation results show that for AlexNet and VGG16, the ERS dataflow executes about 60% faster than the row stationary dataflow and improves hardware resource utilization by about 30%; for MobileNet V1, execution speed improves by about 4%.
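
To make the mapping concrete, the sketch below is a minimal functional model in Python of the scheme the abstract describes; it is an illustration under stated assumptions, not the authors' hardware implementation, and the names conv1d_row, conv1d_row_tiled, ers_conv, and all shapes are hypothetical. Each processing unit convolves one filter row with one ifmap row and reuses both across every output position; channels map to columns of processing units with filters running in parallel; and a configurable sliding window splits a filter row into segments when the array is narrower than the filter.

    import numpy as np

    def conv1d_row(ifmap_row, filt_row, stride=1):
        # One processing unit: slide one filter row over one ifmap row.
        # Both rows stay resident and are reused at every output position.
        out_w = (len(ifmap_row) - len(filt_row)) // stride + 1
        return np.array([np.dot(ifmap_row[i*stride : i*stride + len(filt_row)], filt_row)
                         for i in range(out_w)])

    def conv1d_row_tiled(ifmap_row, filt_row, array_w, stride=1):
        # Assumed behavior of the configurable sliding window: if the array holds
        # at most array_w filter taps, split the filter row into segments and
        # accumulate the partial sums produced by each pass.
        out_w = (len(ifmap_row) - len(filt_row)) // stride + 1
        out = np.zeros(out_w)
        for off in range(0, len(filt_row), array_w):
            seg = filt_row[off : off + array_w]
            out += np.array([np.dot(ifmap_row[i*stride + off : i*stride + off + len(seg)], seg)
                             for i in range(out_w)])
        return out

    def ers_conv(ifmap, filters, stride=1, array_w=None):
        # ERS-style mapping: each channel of the 3-D filter/ifmap feeds one column
        # of processing units, each unit in a column handles one filter row, and
        # the filters run in parallel columns (modeled here as ordinary loops).
        m, c, r, s = filters.shape           # num filters, channels, filter H, filter W
        _, h, w = ifmap.shape                # channels, ifmap H, ifmap W
        out = np.zeros((m, (h - r)//stride + 1, (w - s)//stride + 1))
        for f in range(m):                   # parallel filter columns
            for ch in range(c):              # one channel per column
                for y in range(out.shape[1]):
                    for fr in range(r):      # one processing unit per filter row
                        row = ifmap[ch, y*stride + fr]
                        out[f, y] += (conv1d_row(row, filters[f, ch, fr], stride)
                                      if array_w is None else
                                      conv1d_row_tiled(row, filters[f, ch, fr], array_w, stride))
        return out

    # Toy check: the narrow-array (tiled) path matches the direct path.
    x = np.random.rand(3, 8, 8)              # 3-channel 8x8 ifmap
    k = np.random.rand(2, 3, 5, 5)           # two 5x5 filters over 3 channels
    assert np.allclose(ers_conv(x, k), ers_conv(x, k, array_w=3))

In hardware, the loops over filters, channels, and rows would run concurrently across the processing array; the sequential form above is only meant to make the row reuse and partial-sum accumulation explicit.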


Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grant 2018YFE0202800, the National Natural Science Foundation of China under Grants 61634004 and 61934002, the Natural Science Foundation of Shaanxi Province for Distinguished Young Scholars under Grant 2020JC-26, the Fundamental Research Funds for the Central Universities under Grants JB190105 and XJS200119, and the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing under Grant 2019A01.

Author information

Corresponding author

Correspondence to Huaxi Gu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, Z., Gu, H., Zhang, B. et al. An Efficient Dataflow Mapping Method for Convolutional Neural Networks. Neural Process Lett 54, 1075–1090 (2022). https://doi.org/10.1007/s11063-021-10670-z
