Tools and methods for Edge-AI-systems

Nils Schwabe; Yexu Zhou; Leon Hielscher; Tobias Röddiger; Till Riedel; Sebastian Reiter

doi:10.1515/auto-2022-0023

Published by De Gruyter (O), September 3, 2022

Werkzeuge und Methoden zum Entwurf von Edge-AI-Systemen

From the journal at - Automatisierungstechnik

https://doi.org/10.1515/auto-2022-0023

Abstract

The enormous potential of artificial intelligence, especially artificial neural networks, has not yet been fully exploited in edge computing applications such as cars, traffic lights, or smart watches. The reason is the computing, energy, and memory requirements of modern neural networks, which embedded devices typically cannot meet. This article provides a detailed summary of today’s challenges and gives deeper insight into existing solutions that improve neural network performance through modern HW/SW co-design techniques.

Summary

The extraordinary potential of artificial intelligence, in particular artificial neural networks, cannot yet be fully exploited today, especially when it is used for edge computing applications, e.g., in cars, traffic lights, or smart watches. The reason is the high demands that modern neural networks place on computing power, energy, and memory, which embedded devices normally cannot meet. This article offers a detailed summary of today’s challenges and gives deeper insight into existing solutions that increase the performance of neural networks with modern HW/SW co-design techniques.

Keywords: Edge-AI; machine learning; hardware acceleration; co-design; AutoML

Schlagwörter: Edge-AI; maschinelles Lernen; Hardware-Beschleuniger; Co-Design; AutoML

Funding statement: This work was supported as part of the Competence Center Karlsruhe for AI Systems Engineering (CC-KING, Az: 3-4332.62-FhG/38, https://www.ai-engineering.eu) by the Ministry of Economic Affairs, Labour, and Tourism Baden-Württemberg.

About the authors

Nils Schwabe

Nils Schwabe works as a research scientist in the research division Intelligent Systems and Production Engineering (ISPE) at the FZI, after finishing his M.Sc. in Embedded Systems Engineering at the University of Freiburg in 2019. His research focus is on SoC architectures for AI applications.

Yexu Zhou

Yexu Zhou works as a research scientist at the Karlsruhe Institute of Technology, where he pursues his PhD focusing on AutoML and neural network applications. He received his M.Sc. in Mechanical Engineering from the Karlsruhe Institute of Technology in 2019.

Leon Hielscher

Leon Hielscher works as a research scientist in the research division Intelligent Systems and Production Engineering (ISPE) at the FZI, after finishing his M.Sc. in Informatics at the Karlsruhe Institute of Technology in 2017. His research focus is on automated design and generation methodologies for SoC platforms.

Tobias Röddiger

Tobias Röddiger works as a research scientist at the Karlsruhe Institute of Technology, where he pursues his PhD focusing on Wearable Computing Systems. He received his M.Sc. in Computer Science from the Karlsruhe Institute of Technology in 2019.

Till Riedel

Till Riedel is lab leader at TECO within the Chair for Pervasive Computing Systems of Michael Beigl and a lecturer at the Karlsruhe Institute of Technology. He defended his PhD on Middleware for Ubiquitous Systems at the Karlsruhe Institute of Technology in 2012.

Sebastian Reiter

Sebastian Reiter works as a department manager in the research division Intelligent Systems and Production Engineering (ISPE) at the FZI. He finished his diploma in computer science at the University of Karlsruhe in 2008.

Received: 2022-02-17

Accepted: 2022-08-05

Published Online: 2022-09-03

Published in Print: 2022-09-27
