DOI: 10.1145/3688636.3688655
research-article

Reconfigurable Hardware Accelerator for Convolution Operations in Convolutional Neural Networks

Published: 11 October 2024

Abstract

Convolutional neural networks (CNNs) have significantly advanced image classification, video processing, and pattern recognition. Compared with other hardware deployment platforms, field-programmable gate arrays (FPGAs) offer programmability, low power consumption, parallelism, and low cost. However, the substantial computational demands of CNNs and the limited logic resources of FPGAs constrain CNN deployment and acceleration, which makes real-time performance on resource-limited edge devices challenging. This paper proposes a dynamically reconfigurable CNN accelerator that combines reconfigurable convolution techniques with matrix decomposition methods. Compared with traditional convolution accelerators, the design significantly reduces resource consumption and improves hardware utilization while maintaining high classification accuracy. We implemented the accelerator using Vivado 2021.1 and evaluated its resource consumption, including DSP and LUT usage, across multiple convolutional kernels. The experimental results show that the proposed accelerator offers significant advantages in resource consumption and hardware utilization, performs robustly, and has broad prospects in practical applications.
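
The abstract does not detail how the reconfigurable convolution is realized. As an illustration only, one common way to serve several kernel sizes with a single fixed-size compute unit is to split a larger kernel into zero-padded 3×3 blocks, run each block through the same unit, and accumulate the partial sums. The sketch below checks that idea in plain Python; the function names (`split_kernel`, `conv2d_tiled`) and the choice of a 3×3 unit are assumptions for this example and are not taken from the paper.

```python
def conv2d_valid(image, kernel):
    """Direct 'valid' 2-D correlation (the CNN convention), used as reference."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(ow)] for r in range(oh)]


def split_kernel(kernel, tile=3):
    """Cut a kernel into tile x tile blocks, zero-padding the edge blocks.

    Returns a list of (row_offset, col_offset, block) triples.
    The 3x3 default is an assumption, not the paper's stated unit size.
    """
    kh, kw = len(kernel), len(kernel[0])
    blocks = []
    for r0 in range(0, kh, tile):
        for c0 in range(0, kw, tile):
            block = [[kernel[r0 + i][c0 + j]
                      if r0 + i < kh and c0 + j < kw else 0
                      for j in range(tile)] for i in range(tile)]
            blocks.append((r0, c0, block))
    return blocks


def conv2d_tiled(image, kernel, tile=3):
    """Compute the same result as conv2d_valid using only tile x tile blocks."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    oh, ow = h - kh + 1, w - kw + 1
    # Zero-pad the image so the padded part of edge blocks stays in bounds.
    pad_h, pad_w = -kh % tile, -kw % tile
    padded = [row + [0] * pad_w for row in image]
    padded += [[0] * (w + pad_w) for _ in range(pad_h)]
    out = [[0] * ow for _ in range(oh)]
    for r0, c0, block in split_kernel(kernel, tile):
        # Each pass models one use of the fixed tile x tile compute unit.
        for r in range(oh):
            for c in range(ow):
                out[r][c] += sum(padded[r + r0 + i][c + c0 + j] * block[i][j]
                                 for i in range(tile) for j in range(tile))
    return out


if __name__ == "__main__":
    img = [[(r * 7 + c * 3) % 11 for c in range(8)] for r in range(7)]
    k5 = [[(i * 5 + j) % 7 - 3 for j in range(5)] for i in range(5)]
    assert conv2d_tiled(img, k5) == conv2d_valid(img, k5)
```

In this sketch a 5×5 kernel costs four passes through the 3×3 unit, with the padded positions contributing zero. Whether the paper's design schedules its kernels this way is not stated in the abstract.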

    Published In

    ICCBN '24: Proceedings of the 2024 12th International Conference on Communications and Broadband Networking
    July 2024
    221 pages
    ISBN:9798400717109
    DOI:10.1145/3688636

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. FPGA
    2. convolutional neural network
    3. reconfigurable

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Science and Technology Major Project of Tibetan Autonomous Region of China

    Conference

    ICCBN 2024
