An energy-efficient convolutional neural network accelerator for speech classification based on FPGA and quantization

Regular Paper, published in CCF Transactions on High Performance Computing

Abstract

Deep convolutional neural networks (CNNs), which are widely applied to image tasks, can also achieve excellent performance on acoustic tasks. However, activation data in a CNN are usually represented in floating-point format, which is both time-consuming and power-consuming to compute. Quantization converts activation data to fixed-point format, replacing floating-point computation with faster and more energy-efficient fixed-point computation. Based on this method, this article proposes a design-space search method for quantizing a binary-weight neural network. A dedicated accelerator built on an FPGA platform uses a layer-by-layer pipeline design and achieves higher throughput and energy efficiency than CPU and other hardware platforms.
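To make the quantization idea in the abstract concrete, here is a minimal sketch of uniform signed fixed-point quantization of activations. This is an illustration only, not the authors' design-space search method; the word length and fraction-bit parameters are hypothetical choices.

```python
import numpy as np

def quantize_fixed_point(x, total_bits=8, frac_bits=4):
    """Quantize floating-point activations to signed fixed-point integer codes.

    total_bits: word length; frac_bits: bits after the binary point.
    A code q represents the real value q / 2**frac_bits.
    """
    scale = 2 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    # Round to the nearest fixed-point step, then clip to the word length.
    return np.clip(np.round(x * scale), qmin, qmax).astype(np.int32)

def dequantize(q, frac_bits=4):
    # Recover the (approximate) real values the integer codes represent.
    return q.astype(np.float32) / (2 ** frac_bits)

acts = np.array([0.13, -1.7, 3.999, 10.0], dtype=np.float32)
codes = quantize_fixed_point(acts, total_bits=8, frac_bits=4)
print(codes.tolist())               # integer codes, e.g. 10.0 saturates at qmax
print(dequantize(codes).tolist())   # values on the fixed-point grid
```

The arithmetic on the integer codes is what an FPGA datapath would perform; note how the out-of-range value 10.0 saturates to the largest representable value, one of the accuracy/efficiency trade-offs a quantization search must balance.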





Acknowledgements

This work was supported by National Science and Technology Major Project 2018ZX01028101.

Author information


Corresponding author

Correspondence to Jingfei Jiang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.


Cite this article

Wen, D., Jiang, J., Dou, Y. et al. An energy-efficient convolutional neural network accelerator for speech classification based on FPGA and quantization. CCF Trans. HPC 3, 4–16 (2021). https://doi.org/10.1007/s42514-020-00055-4
