Robust MCU Oriented KWS Model for Children Robotic Prosthetic Hand Control

Perdomo-Campos, Alejandro; Ramírez-Beltrán, Jorge; Morgado-Estevez, Arturo

doi:10.1007/978-3-031-49552-6_25

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14335)

Included in the following conference series:

International Workshop on Artificial Intelligence and Pattern Recognition

Abstract

Few prosthetic hand models in the literature are designed for children, and none of them was found to use speech commands as a control method. Voice control based on Keyword Spotting (KWS) is a non-invasive method that offers many advantages over others. KWS systems based on deep learning models have proved to be the most accurate, but implementing them on microcontrollers (MCUs) is challenging because of the limited hardware resources of MCUs. In this paper, a robust KWS model based on log-Mel spectrograms and CNNs is presented for deployment on MCUs. The model is trained to recognize 5 keywords using the Multilingual Spoken Words Corpus and UrbanSound8k datasets; a large number of non-keywords and background noise are included in training to provide robustness. Several popular MCU platforms are evaluated for implementing the model, and STM32 was chosen for its advantages. Inference time simulations were run on several STM32 boards compatible with the model.
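
As an illustration of the log-Mel front end mentioned in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the frame length, hop size, number of mel bands, HTK-style mel formula, and all function names are assumptions chosen for readability, and the naive DFT would be replaced by an FFT in practice.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build triangular filters spaced evenly on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge points: each filter spans three consecutive edges.
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive O(n^2) DFT."""
    n = len(frame)
    windowed = [
        x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=10):
    """Return a list of frames, each a list of n_mels log mel-band energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        mel_energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small offset avoids log(0) on silent bands.
        frames.append([math.log(e + 1e-6) for e in mel_energies])
    return frames
```

Calling `log_mel_spectrogram(samples, 8000)` on a list of audio samples yields a time-by-frequency matrix of the kind a small CNN can take as input. On an MCU the same steps would typically run in fixed-point or with a vendor DSP library rather than in Python.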

Acknowledgements

The authors would like to thank the Academic Interchange and Mobility Program (PIMA) run by the University of Cadiz, which made the authors' collaboration on this research possible. This research was partially funded by the FEDER research project “Sistemas multimodales avanzados para prótesis robóticas de miembro superior (PROBOTHAND)” (FEDER-UCA18-108407) from Junta de Andalucía, Spain.

Author information

Authors and Affiliations

Center for Microelectronics Research, Technological University of Havana “José Antonio Echeverría”, 114 Street e/Ciclovía & Rotonda, Marianao, Havana, Cuba
Alejandro Perdomo-Campos
Center for Hydraulic Research, Technological University of Havana “José Antonio Echeverría”, 114 Street e/Ciclovía & Rotonda, Marianao, Havana, Cuba
Jorge Ramírez-Beltrán
School of Engineering, Av. Universidad de Cádiz, 10, University of Cadiz, Puerto Real, Cadiz, Spain
Arturo Morgado-Estevez

Corresponding author

Correspondence to Alejandro Perdomo-Campos.

Editor information

Editors and Affiliations

Universidad de las Ciencias Informáticas, Havana, Cuba
Yanio Hernández Heredia
Universidad de las Ciencias Informáticas, Havana, Cuba
Vladimir Milián Núñez
Universidad de las Ciencias Informáticas, Havana, Cuba
José Ruiz Shulcloper

About this paper

Cite this paper

Perdomo-Campos, A., Ramírez-Beltrán, J., Morgado-Estevez, A. (2024). Robust MCU Oriented KWS Model for Children Robotic Prosthetic Hand Control. In: Hernández Heredia, Y., Milián Núñez, V., Ruiz Shulcloper, J. (eds) Progress in Artificial Intelligence and Pattern Recognition. IWAIPR 2023. Lecture Notes in Computer Science, vol 14335. Springer, Cham. https://doi.org/10.1007/978-3-031-49552-6_25

DOI: https://doi.org/10.1007/978-3-031-49552-6_25
Published: 20 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49551-9
Online ISBN: 978-3-031-49552-6
eBook Packages: Computer Science, Computer Science (R0)