research-article

Reconfigurable Hardware Accelerator for Convolution Operations in Convolutional Neural Networks

Authors:

Zhicheng Dong

ICCBN '24: Proceedings of the 2024 12th International Conference on Communications and Broadband Networking

Pages 20 - 26

https://doi.org/10.1145/3688636.3688655

Published: 11 October 2024

Abstract

Convolutional neural networks (CNNs) have significantly advanced image classification, video processing, and pattern recognition. Compared with other hardware deployment platforms, field-programmable gate arrays (FPGAs) offer programmability, low power consumption, parallelism, and low cost. However, the substantial computational demands of CNNs and the limited logic resources of FPGAs constrain CNN deployment and acceleration, which makes real-time performance on resource-limited edge devices difficult to achieve. This paper proposes a dynamically reconfigurable CNN accelerator that combines reconfigurable convolution techniques with matrix decomposition methods. Compared with traditional convolution accelerators, the design significantly reduces resource consumption and improves hardware utilization while maintaining high classification accuracy. We implemented the accelerator using Vivado 2021.1 and evaluated the resource consumption, including DSP and LUT usage, of multiple convolutional kernels. The experimental results demonstrate that the proposed accelerator offers significant advantages in resource consumption and hardware utilization, exhibits robust performance, and has broad prospects in practical applications.
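
The abstract does not state which matrix decomposition the accelerator uses, so the following is only an illustrative sketch of the general idea, not the author's method. It assumes a rank-1 (separable) 3×3 kernel, which can be written as the outer product of a column vector and a row vector, and shows that one 2D pass can be replaced by a row pass followed by a column pass. All names (`conv2d_valid`, `conv2d_separable`, `outer`) and the sample data are hypothetical.

```python
# Illustrative sketch only: assumes a rank-1 (separable) kernel.
# The paper's actual decomposition is not given in the abstract.

def conv2d_valid(image, kernel):
    """Direct 2D 'valid' correlation: kh * kw multiplies per output pixel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + u][j + v] * kernel[u][v]
                for u in range(kh) for v in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def outer(col, row):
    """Build the kh x kw kernel col * row^T."""
    return [[c * r for r in row] for c in col]


def conv2d_separable(image, col, row):
    """Same result via a 1 x kw row pass, then a kh x 1 column pass."""
    row_pass = conv2d_valid(image, [row])              # kernel shape 1 x kw
    return conv2d_valid(row_pass, [[c] for c in col])  # kernel shape kh x 1


col = [1, 2, 1]           # hypothetical column factor
row = [-1, 0, 1]          # hypothetical row factor
kernel = outer(col, row)  # Sobel-like 3x3 kernel

# Small deterministic integer test image (5 rows x 6 columns).
image = [[(3 * i + 5 * j) % 7 for j in range(6)] for i in range(5)]

direct = conv2d_valid(image, kernel)
separable = conv2d_separable(image, col, row)

mults_direct = len(col) * len(row)     # multiplies per output, direct form
mults_separable = len(col) + len(row)  # approximate multiplies per output, factored form
```

Under this assumption the factored form needs roughly `kh + kw` multiplications per output pixel instead of `kh * kw`, which is the kind of saving that would reduce multiplier (DSP) demand in hardware. Kernels that are not exactly rank-1 would need a sum of several such terms or an approximation.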

Index Terms

1. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs
      1. Hardware accelerators

Published In

ICCBN '24: Proceedings of the 2024 12th International Conference on Communications and Broadband Networking

July 2024

221 pages

ISBN:9798400717109

DOI:10.1145/3688636

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Science and Technology Major Project of Tibetan Autonomous Region of China

Conference

ICCBN 2024

ICCBN 2024: 2024 12th International Conference on Communications and Broadband Networking

July 24 - 27, 2024

Nyingchi, China
