ABSTRACT
Artificial intelligence (AI) on tiny edge devices has become feasible thanks to the emergence of high-performance microcontrollers (MCUs) and lightweight machine learning (ML) models. Nevertheless, the cost and power consumption of these MCUs and the computation requirements of these ML algorithms still present barriers that prevent the widespread inclusion of AI functionality on smaller, cheaper, and lower-power devices. Thus, there is an urgent need for a more efficient ML algorithm and implementation strategy suitable for lower-end MCUs.
This paper presents MicroVSA, a library of optimized implementations of a low-dimensional computing (LDC) classifier, a recently proposed variant of vector symbolic architecture (VSA), for 8-, 16-, and 32-bit MCUs. MicroVSA achieves up to 21.86x speedup and 8x lower flash utilization compared to the vanilla LDC. Evaluation results on the three most common always-on inference tasks (myocardial infarction detection, human activity recognition, and hot word detection) demonstrate that MicroVSA outperforms traditional classifiers and achieves accuracy comparable to tiny deep learning models, while requiring only a few tens of bytes of RAM and fitting easily in tiny 8-bit MCUs. For instance, our model for recognizing human activity from inertial sensor data needs only 2.46 KiB of flash and 0.02 KiB of RAM and can complete one inference in 0.85 ms on a 32-bit ARM Cortex-M4 MCU or 11.82 ms on a tiny 8-bit AVR MCU, whereas the RNN model running on a higher-end ARM Cortex-M3 requires 62.0 ms. Our study suggests that ubiquitous ML deployment on low-cost tiny MCUs is possible, and that further study of VSA model training, model compression, and implementation techniques is needed to lower the cost and power of ML on edge devices even more.
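The abstract does not describe the inference procedure itself. As a rough illustration of why binary VSA-style classifiers suit small MCUs, the sketch below shows one generic formulation: XOR binding, majority-vote bundling, and a nearest-class lookup by Hamming distance. It is not MicroVSA's code or API. The dimension `D`, the packing of vectors into integers, and the names `popcount`, `encode`, and `classify` are all assumptions made for this example.

```python
# Generic binary VSA-style inference sketch (hypothetical, not MicroVSA's API).
D = 64                    # assumed vector dimension in bits
MASK = (1 << D) - 1       # keeps packed vectors within D bits


def popcount(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")


def encode(feature_vecs, value_vecs, levels):
    """Encode one sample as a single D-bit vector.

    Each feature's vector is bound (XOR) with the value vector selected by
    that feature's quantized level; the bound vectors are then bundled by a
    per-bit majority vote.
    """
    counts = [0] * D
    for f, lvl in zip(feature_vecs, levels):
        bound = (f ^ value_vecs[lvl]) & MASK
        for b in range(D):
            counts[b] += (bound >> b) & 1
    n = len(levels)
    out = 0
    for b in range(D):
        if 2 * counts[b] > n:   # strict majority sets the bit
            out |= 1 << b
    return out


def classify(sample: int, class_vecs) -> int:
    """Index of the class vector with the smallest Hamming distance."""
    dists = [popcount(sample ^ c) for c in class_vecs]
    return dists.index(min(dists))
```

In this formulation the whole inference uses only XOR, bit counting, and integer comparison. On an MCU, the per-bit loops would typically be replaced by word-wide XOR and a popcount routine.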
Index Terms
- MicroVSA: An Ultra-Lightweight Vector Symbolic Architecture-based Classifier Library for Always-On Inference on Tiny Microcontrollers