
Adaptive design and implementation of automatic modulation recognition accelerator

  • Original Research
  • Published in: Journal of Ambient Intelligence and Humanized Computing

Abstract

Automatic modulation recognition-oriented Deep Neural Networks (ADNNs) have achieved higher recognition accuracy than traditional methods with less labor overhead. However, their computational complexity usually far exceeds the capacity of communication devices built on Field Programmable Gate Array (FPGA) platforms. When resources are insufficient, an FPGA-based accelerator can still complete its computation by dividing the workload into several parts and processing them separately, but this incurs unacceptable latency. Against this backdrop, we develop a new ADNN model, named VT-CNN2+, to improve recognition accuracy. Then, after stating the resource and latency problems of implementing VT-CNN2+ on FPGA platforms, we propose an adaptive hardware accelerator. To implement the accelerator, Area Folding is introduced to optimize resource consumption. Moreover, Literacy Optimization, Parallelism Optimization, Inter-layer Cascading, Temporary Cache, and Data Loading Optimization are adopted to reduce latency. Afterwards, the two components of our accelerator are detailed, i.e., the Once-designed module and the Re-designed module. Finally, to evaluate the performance and adaptivity of our accelerator, a series of experiments is conducted on two different FPGA platforms, i.e., AX7350 and ZedBoard. Results show that our accelerator successfully adapts to different FPGA platforms and remarkably reduces processing latency. Moreover, at 0.066249 s per data sample, its processing speed is one order of magnitude faster than desktop-level Central Processing Units (CPUs) and two orders of magnitude faster than embedded CPUs, with much lower energy consumption.
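The area-folding idea sketched in the abstract (time-multiplexing one shared compute block across slices of a layer, rather than instantiating hardware for the whole layer at once) can be illustrated with a minimal software analogy. The sketch below is not the authors' implementation; the function name, fold scheme, and 1-D convolution example are hypothetical, chosen only to show that computing channel groups sequentially on a shared resource yields the same result as computing them all in parallel.

```python
def conv1d_folded(x, weights, n_folds):
    """Illustrative analogy of area folding: the output channels of a
    1-D convolution are computed in `n_folds` sequential passes, as if a
    single hardware block were reused for each channel group.
    x: input sequence, weights: list of per-channel kernels."""
    out_ch, k = len(weights), len(weights[0])
    assert out_ch % n_folds == 0, "channels must split evenly into folds"
    per_fold = out_ch // n_folds
    out_len = len(x) - k + 1
    y = [[0.0] * out_len for _ in range(out_ch)]
    for f in range(n_folds):                       # one reuse of the shared block
        for c in range(f * per_fold, (f + 1) * per_fold):
            for t in range(out_len):               # multiply-accumulate per output
                y[c][t] = sum(weights[c][i] * x[t + i] for i in range(k))
    return y
```

More folds mean less concurrent hardware (lower resource use) but more sequential passes (higher latency), which is exactly the trade-off the accelerator's latency optimizations then address.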




Data availability

The authors confirm that the data supporting the findings of this study are available within the article.


Author information


Corresponding author

Correspondence to Xianglin Wei.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, B., Wei, X., Wang, C. et al. Adaptive design and implementation of automatic modulation recognition accelerator. J Ambient Intell Human Comput 15, 779–795 (2024). https://doi.org/10.1007/s12652-023-04736-0

