An energy-efficient convolutional neural network accelerator for speech classification based on FPGA and quantization

Regular Paper, published in CCF Transactions on High Performance Computing

Abstract

Deep convolutional neural networks (CNNs), which are widely applied to image tasks, can also achieve excellent performance on acoustic tasks. However, activation data in a CNN are usually represented in floating-point format, which is both time-consuming and power-consuming to compute. Quantization converts activation data to fixed-point format, replacing floating-point computation with faster and more energy-efficient fixed-point computation. Based on this method, this article proposes a design-space search method for quantizing a binary-weight neural network. A dedicated accelerator built on an FPGA platform uses a layer-by-layer pipeline design and achieves higher throughput and energy efficiency than CPU and other hardware platforms.
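To make the quantization idea in the abstract concrete, here is a minimal sketch of uniform signed fixed-point quantization of activations. This is an illustration only, not the authors' design-space search method; the word length and fraction-bit parameters are hypothetical choices.

```python
import numpy as np

def quantize_fixed_point(x, total_bits=8, frac_bits=4):
    """Quantize floating-point activations to signed fixed-point integer codes.

    total_bits: word length; frac_bits: bits after the binary point.
    A code q represents the real value q / 2**frac_bits.
    """
    scale = 2 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    # Round to the nearest fixed-point step, then clip to the word length.
    return np.clip(np.round(x * scale), qmin, qmax).astype(np.int32)

def dequantize(q, frac_bits=4):
    # Recover the (approximate) real values the integer codes represent.
    return q.astype(np.float32) / (2 ** frac_bits)

acts = np.array([0.13, -1.7, 3.999, 10.0], dtype=np.float32)
codes = quantize_fixed_point(acts, total_bits=8, frac_bits=4)
print(codes.tolist())               # integer codes, e.g. 10.0 saturates at qmax
print(dequantize(codes).tolist())   # values on the fixed-point grid
```

The arithmetic on the integer codes is what an FPGA datapath would perform; note how the out-of-range value 10.0 saturates to the largest representable value, one of the accuracy/efficiency trade-offs a quantization search must balance.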





Acknowledgements

This work was supported by National Science and Technology Major Project 2018ZX01028101.

Author information


Corresponding author

Correspondence to Jingfei Jiang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.


Cite this article

Wen, D., Jiang, J., Dou, Y. et al. An energy-efficient convolutional neural network accelerator for speech classification based on FPGA and quantization. CCF Trans. HPC 3, 4–16 (2021). https://doi.org/10.1007/s42514-020-00055-4
