
Hardware-Software Co-design for Deep Neural Network Acceleration

  • Conference paper
Service Science (ICSS 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1844)


Abstract

Deep neural networks are widely used in many fields, but their extensive computational requirements are often difficult to meet during network inference. Model pruning, a technique that removes redundant model weights to accelerate inference, offers one way to address this problem, but the improvement is usually limited because hardware and software are optimized separately. In this paper, we propose a complete hardware-software co-design framework that supports irregular sparse models. Specifically, we prune redundant model weights through iterative pruning with an increasing penalty factor, and we improve hardware efficiency through hardware thread control. The framework achieves a significant efficiency gain, reducing inference latency by 64.2% in vector-multiplication and 86.5% in convolution applications. The experimental results demonstrate the performance improvement and prove the effectiveness of the proposed method.
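The iterative pruning described above, in which the penalty factor grows each round, can be sketched as follows. This is a minimal illustration only: it assumes L1 soft-thresholding as the penalty mechanism and a geometric penalty schedule, and the function name `iterative_prune` and its parameters (`penalty`, `growth`, `steps`) are hypothetical, not the authors' implementation.

```python
import numpy as np

def iterative_prune(weights, steps=5, penalty=0.01, growth=2.0):
    """Iteratively shrink weights toward zero with an increasing
    penalty, hard-pruning any weight that reaches zero.

    Hypothetical sketch: the paper's exact penalty update rule is
    not given in the abstract.
    """
    w = weights.copy()
    for _ in range(steps):
        # Soft-threshold by the current penalty: small-magnitude
        # weights are driven to exactly zero (pruned).
        w = np.sign(w) * np.maximum(np.abs(w) - penalty, 0.0)
        penalty *= growth  # increase the penalty factor each iteration
    return w

w = np.array([0.5, -0.02, 0.3, 0.001, -0.8])
pruned = iterative_prune(w)
sparsity = float(np.mean(pruned == 0.0))
```

Because the penalty grows geometrically, later iterations prune increasingly large weights, yielding the irregular (unstructured) sparsity pattern that the co-designed hardware side must then execute efficiently.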



Acknowledgements

This research is supported by National Key R&D Program of China Grant No. 2020YFB1805505 and Natural Science Foundation of Shandong Province Grant No. ZR2022LZH017.

Author information


Corresponding author

Correspondence to Bingbing Li.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Wang, Y., Li, B., Lu, L., Wang, J., Li, R., Kan, H. (2023). Hardware-Software Co-design for Deep Neural Network Acceleration. In: Wang, Z., Wang, S., Xu, H. (eds) Service Science. ICSS 2023. Communications in Computer and Information Science, vol 1844. Springer, Singapore. https://doi.org/10.1007/978-981-99-4402-6_16


  • DOI: https://doi.org/10.1007/978-981-99-4402-6_16

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4401-9

  • Online ISBN: 978-981-99-4402-6

  • eBook Packages: Computer Science, Computer Science (R0)
