A New Lightweight CRNN Model for Keyword Spotting with Edge Computing Devices

  • Conference paper
Machine Learning for Cyber Security (ML4CS 2020)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12486)

Abstract

Keyword Spotting (KWS) is a significant branch of Automatic Speech Recognition (ASR) that is widely used on edge computing devices. The goal of KWS is to provide high accuracy at a low false alarm rate (FAR) while reducing memory, computation, and latency costs. However, the limited resources of edge computing devices make KWS applications challenging. Lightweight deep learning models and structures have achieved good results in KWS, maintaining high accuracy with low computational cost and low latency. In this paper, we present a new Convolutional Recurrent Neural Network (CRNN) architecture named EdgeCRNN for edge computing devices. EdgeCRNN is based on depthwise separable convolution (DSC) and a residual structure, and it uses a feature enhancement method. Experimental results on the Google Speech Commands Dataset show that EdgeCRNN can process 11.1 audio clips per second on a Raspberry Pi 3B+, which is 2.2 times the rate of Tpool2. At the same time, EdgeCRNN reaches an accuracy of 98.05%, remaining competitive with Tpool2.
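
To illustrate why depthwise separable convolution suits resource-limited devices, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1 x 1 pointwise convolution. The layer sizes (`C_IN`, `C_OUT`, `K`) and the function names are assumptions chosen only for illustration; they are not taken from the paper or from EdgeCRNN.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes for illustration only (not from the paper).
C_IN, C_OUT, K = 64, 128, 3

standard = standard_conv_params(C_IN, C_OUT, K)
separable = depthwise_separable_params(C_IN, C_OUT, K)
ratio = separable / standard  # equals 1 / C_OUT + 1 / K**2

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {ratio:.3f}")
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer; the saving grows with the kernel size and the number of output channels.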

This paper is supported by the National Natural Science Foundation of China (No. 61572028), the National Cryptography Development Fund (No. MMJJ20180206), the Project of Science and Technology of Guangzhou (No. 201802010044), and the Guangdong Basic and Applied Basic Research Foundation (No. 2019A1515011797).

Notes

  1. https://github.com/genty1314/KWS.git

Author information

Corresponding author

Correspondence to Yamin Wen.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Wei, Y., Gong, Z., Yang, S., Ye, K., Wen, Y. (2020). A New Lightweight CRNN Model for Keyword Spotting with Edge Computing Devices. In: Chen, X., Yan, H., Yan, Q., Zhang, X. (eds) Machine Learning for Cyber Security. ML4CS 2020. Lecture Notes in Computer Science, vol 12486. Springer, Cham. https://doi.org/10.1007/978-3-030-62223-7_17

  • DOI: https://doi.org/10.1007/978-3-030-62223-7_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62222-0

  • Online ISBN: 978-3-030-62223-7

  • eBook Packages: Computer Science, Computer Science (R0)
