
ACDSE: A Design Space Exploration Method for CNN Accelerator based on Adaptive Compression Mechanism

Published: 09 November 2023

Abstract

Customized accelerators for Convolutional Neural Networks (CNNs) can achieve better energy efficiency than general-purpose computing platforms. However, designing a high-performance accelerator requires balancing a variety of parameters and physical constraints. The growing number of parameters and increasingly tight constraints complicate the design space, posing new challenges to the capacity and efficiency of design space exploration (DSE) methods. In this paper, we present ACDSE, a novel design space exploration method for optimizing the design process of CNN accelerators. ACDSE implements an adaptive compression mechanism that dynamically adjusts the search range and prunes low-value design points according to the exploration state, allowing it to focus on valuable subspaces while improving exploration capacity and efficiency. We apply ACDSE to the problem of CNN accelerator latency optimization. Experiments indicate that, under the most stringent constraints, ACDSE reduces latency by 1.39x-5.07x and improves efficiency by 2.07x-43.87x compared with prior DSE methods, demonstrating its superior adaptability to complicated design spaces.
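
The abstract describes adaptive compression at a high level: the search range is shrunk around promising regions and low-value design points are pruned as exploration proceeds. The sketch below illustrates this general idea in a simple randomized DSE loop; it is not the paper's ACDSE algorithm, and all names and parameters (evaluate_latency, compress_ratio, samples_per_iter) are hypothetical placeholders standing in for a real accelerator cost model and tuning knobs.

```python
# Illustrative sketch only: a generic adaptive-compression search loop,
# not a reproduction of the ACDSE method described in the paper.
import random

def evaluate_latency(design):
    """Placeholder cost model; a real flow would query an accelerator
    simulator or analytical model (hypothetical stand-in)."""
    return sum((x - 7) ** 2 for x in design)  # toy surrogate with a known optimum

def adaptive_compression_search(dims, lo=1, hi=16, iters=50,
                                compress_ratio=0.9, samples_per_iter=8):
    """Randomized search whose per-parameter range is compressed toward the
    best design found so far, pruning the low-value part of the space."""
    ranges = [(lo, hi)] * dims          # current search range per design parameter
    best, best_cost = None, float("inf")

    for _ in range(iters):
        # Sample candidate design points inside the current (compressed) ranges.
        candidates = [tuple(random.randint(l, h) for l, h in ranges)
                      for _ in range(samples_per_iter)]

        # Evaluate candidates and keep the best point seen so far.
        for cand in candidates:
            cost = evaluate_latency(cand)
            if cost < best_cost:
                best, best_cost = cand, cost

        # Adaptive compression: shrink each parameter's range toward the best
        # point, discarding the remaining low-value region of the space.
        new_ranges = []
        for (l, h), center in zip(ranges, best):
            span = max(1, int((h - l) * compress_ratio))
            new_lo = max(lo, center - span // 2)
            new_hi = min(hi, new_lo + span)
            new_ranges.append((new_lo, new_hi))
        ranges = new_ranges

    return best, best_cost

if __name__ == "__main__":
    design, latency = adaptive_compression_search(dims=4)
    print("best design:", design, "estimated latency:", latency)
```

In a real flow, evaluate_latency would be replaced by a cycle-level simulator or analytical model of the accelerator, and the compression schedule would be driven by the exploration state rather than a fixed ratio, as the abstract suggests.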


Cited By

  • (2025) CSDSE: An efficient design space exploration framework for deep neural network accelerator based on cooperative search. Neurocomputing, Article 129366. DOI: 10.1016/j.neucom.2025.129366
  • (2024) LCDSE: Enable Efficient Design Space Exploration for DCNN Accelerator Based on Layer Clustering. IEEE Transactions on Circuits and Systems II: Express Briefs 71, 10 (Oct. 2024), 4486-4490. DOI: 10.1109/TCSII.2024.3393986
  • (2023) Boomerang: Physical-Aware Design Space Exploration Framework on RISC-V SonicBOOM Microarchitecture. In 2023 IEEE 34th International Conference on Application-specific Systems, Architectures and Processors (ASAP), 85-93. DOI: 10.1109/ASAP57973.2023.00026

      Published In

      ACM Transactions on Embedded Computing Systems, Volume 22, Issue 6
      November 2023, 428 pages
      ISSN: 1539-9087
      EISSN: 1558-3465
      DOI: 10.1145/3632298
      Editor: Tulika Mitra

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Publication History

      Published: 09 November 2023
      Online AM: 28 June 2022
      Accepted: 12 June 2022
      Revised: 12 April 2022
      Received: 31 December 2021
      Published in TECS Volume 22, Issue 6


      Author Tags

      1. CNN accelerator
      2. design space exploration
      3. reinforcement learning
      4. derivative-free optimization

      Qualifiers

      • Research-article
