DOI: 10.1145/3370748.3407001
research-article

NS-KWS: joint optimization of near-sensor processing architecture and low-precision GRU for always-on keyword spotting

Published: 10 August 2020

ABSTRACT

Keyword spotting (KWS) is a crucial front-end module in a speech interaction system. The always-on KWS module monitors incoming speech and activates the energy-hungry back-end system only when a keyword is detected. The KWS module therefore determines the standby performance of the whole system, and in conventional designs its power consumption is bottlenecked by data conversion near the microphone sensor. In this paper, we propose an energy-efficient near-sensor processing architecture for always-on KWS that enhances the continuous perception capability of the whole speech interaction system. By performing keyword detection in the analog domain directly after the microphone sensor, the architecture avoids energy-consuming data converters and runs faster than conventional implementations. In addition, we propose a lightweight gated recurrent unit (GRU) with negligible accuracy loss to preserve recognition performance. We implement and fabricate the proposed KWS system in a 0.18 μm CMOS process. In system-level evaluation, the hardware-software co-designed architecture achieves 65.6% energy savings and a 71× speedup over the state of the art.
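
To make the idea of a low-precision GRU concrete, the following is a minimal sketch of one GRU layer whose weights are rounded to a few bits. It is an illustration under stated assumptions, not the network or quantization method of the paper: the symmetric uniform quantizer, the 4-bit width, the layer sizes, and all function names (`quantize`, `gru_step`, `random_params`, `run_sequence`) are hypothetical choices made here, and biases are omitted for brevity.

```python
import math
import random


def quantize(matrix, bits=4):
    """Symmetric uniform quantization (an assumed scheme); requires bits >= 2."""
    levels = 2 ** (bits - 1) - 1                     # 7 positive levels for 4 bits
    max_abs = max(abs(w) for row in matrix for w in row) or 1.0
    scale = max_abs / levels
    return [[round(w / scale) * scale for w in row] for row in matrix]


def matvec(m, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One GRU time step without biases: h' = (1 - z) * n + z * h."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]           # reset gate applied to the state
    n = [math.tanh(a + b) for a, b in zip(matvec(p["Wn"], x), matvec(p["Un"], rh))]
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def random_params(n_in, n_hidden, bits=4, seed=0):
    """Random weights standing in for trained ones, quantized to `bits` bits."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    cols = {"Wz": n_in, "Wr": n_in, "Wn": n_in,
            "Uz": n_hidden, "Ur": n_hidden, "Un": n_hidden}
    return {name: quantize(mat(n_hidden, c), bits) for name, c in cols.items()}


def run_sequence(frames, params, n_hidden):
    """Feed a sequence of feature frames through the layer from a zero state."""
    h = [0.0] * n_hidden
    for x in frames:
        h = gru_step(x, h, params)
    return h


params = random_params(n_in=3, n_hidden=4, bits=4)
final_state = run_sequence([[0.1, -0.2, 0.3]] * 5, params, n_hidden=4)
print(final_state)
```

With 4-bit symmetric quantization each weight matrix holds at most 15 distinct values, which is the kind of reduction that makes a recurrent layer cheaper to store and compute. A real system would train or fine-tune with quantization in the loop rather than round random weights as done here.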


Supplemental Material

3370748.3407001.mp4

mp4

129.3 MB


      Published in

      ISLPED '20: Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design
      August 2020
      263 pages
      ISBN: 9781450370530
      DOI: 10.1145/3370748

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 398 of 1,159 submissions, 34%
