DOI: 10.1145/3582016.3582062 (ACM, ASPLOS Conference Proceedings)

Space-Efficient TREC for Enabling Deep Learning on Microcontrollers

Published: 25 March 2023

ABSTRACT

Deploying deep neural networks (DNNs) in resource-constrained environments while achieving satisfactory performance is challenging. It is especially so on microcontrollers, given their stringent limits on memory space and computing power. This paper focuses on new ways to make TREC, a recently proposed optimization that enables computation reuse in DNNs, space- and time-efficient on microcontrollers. The solution maximizes the performance benefits while keeping DNN accuracy stable. Experiments show that the solution eliminates over 96% of the computations in DNNs and makes them fit well into microcontrollers, producing 3.4–5× speedups with only marginal accuracy loss.

