Robust MCU Oriented KWS Model for Children Robotic Prosthetic Hand Control

  • Conference paper
Progress in Artificial Intelligence and Pattern Recognition (IWAIPR 2023)

Abstract

Few prosthetic hand models designed for children are reported in the literature, and none of them was found to use speech commands as a control method. Voice control based on Keyword Spotting (KWS) is a non-invasive method that offers many advantages over other control methods. KWS systems based on deep learning models have proved to be the most accurate, but implementing them on microcontrollers (MCUs) is challenging because of the limited hardware resources of MCUs. In this paper, a robust KWS model based on log-Mel spectrograms and convolutional neural networks (CNNs) is presented for deployment on MCUs. The model is trained to recognize five keywords using the Multilingual Spoken Words Corpus and UrbanSound8K datasets; a large number of non-keywords and background noise samples are included in training to provide robustness. Several popular MCU platforms are evaluated for implementing the model, and STM32 is chosen for its advantages. Inference time is simulated on several STM32 boards compatible with the model.
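To make the log-Mel front end mentioned in the abstract concrete, the following is a minimal pure-Python sketch of how one frame of a log-Mel spectrogram can be computed from a power spectrum. It is not the authors' implementation: the function names (`hz_to_mel`, `mel_filterbank`, `log_mel_frame`), the HTK-style mel formula, and the parameters in the usage note are assumptions chosen only for illustration, since the abstract does not state them.

```python
import math


def hz_to_mel(f_hz):
    """HTK-style mel scale (an assumed variant): 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    # Centre frequency of every FFT bin in Hz.
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_mel_frame(power_spectrum, bank, eps=1e-10):
    """Apply the filterbank to one power-spectrum frame and take the log."""
    return [
        math.log(sum(w * p for w, p in zip(row, power_spectrum)) + eps)
        for row in bank
    ]
```

As a usage example under assumed settings (40 mel bands, a 512-point FFT, 16 kHz audio), `mel_filterbank(40, 512, 16000)` yields 40 filters of 257 weights each, and `log_mel_frame` maps one 257-bin power spectrum to 40 log-Mel features. Stacking such frames over time gives the two-dimensional input that a CNN can consume.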

Acknowledgements

The authors would like to thank the Academic Interchange and Mobility Program (PIMA) of the University of Cadiz, which made the authors' collaboration on this research possible. This research was partially funded by the FEDER research project “Sistemas multimodales avanzados para prótesis robóticas de miembro superior (PROBOTHAND)” (FEDER-UCA18-108407) from Junta de Andalucía, Spain.

Corresponding author

Correspondence to Alejandro Perdomo-Campos.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Perdomo-Campos, A., Ramírez-Beltrán, J., Morgado-Estevez, A. (2024). Robust MCU Oriented KWS Model for Children Robotic Prosthetic Hand Control. In: Hernández Heredia, Y., Milián Núñez, V., Ruiz Shulcloper, J. (eds) Progress in Artificial Intelligence and Pattern Recognition. IWAIPR 2023. Lecture Notes in Computer Science, vol 14335. Springer, Cham. https://doi.org/10.1007/978-3-031-49552-6_25

  • DOI: https://doi.org/10.1007/978-3-031-49552-6_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-49551-9

  • Online ISBN: 978-3-031-49552-6

  • eBook Packages: Computer Science, Computer Science (R0)
