
CAD-to-real: enabling deep neural networks for 3D pose estimation of electronic control units

A transferable and automated approach for industrial use cases

CAD-to-real: a method for applying deep neural networks to 3D pose estimation of electronic control units
A transferable and automated approach for industrial applications
Simon Bäuerle, Moritz Böhland, Jonas Barth, Markus Reischl, Andreas Steimer and Ralf Mikut

Abstract

Image processing techniques are widely used within automotive series production, including the production of electronic control units (ECUs). Deep learning approaches have made rapid advances in recent years, but are not yet prominent in such industrial settings. One major obstacle is the lack of suitable training data. We adapt the recently developed method of domain randomization to our use case of 3D pose estimation of ECU housings. We create purely synthetic training data with high visual diversity to train artificial neural networks (ANNs). This enables an ANN to estimate the 3D pose of a real sample part with high accuracy from a single low-resolution RGB image in a production-like setting. Requirements regarding measurement hardware are very low. The entire setup is fully automated and can be transferred to related industrial use cases.
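To make the approach concrete, the sketch below illustrates the core idea of domain randomization for pose regression in Keras [7]. It is a minimal, hypothetical example, not the authors' pipeline: the renderer is replaced by a stub that would, in a real setup, rasterize the ECU's CAD model under a randomized pose, lighting, textures and background; the function names (e.g. render_part), the image resolution and the network architecture are illustrative assumptions.

```python
# Minimal, illustrative sketch of domain-randomized 3D pose regression.
# NOT the authors' original pipeline: `render_part` is a stand-in for a CAD
# renderer, and all names, sizes and layer choices are assumptions.
import numpy as np
from tensorflow.keras import layers, models

IMG_SIZE = 64  # single low-resolution RGB input, as described in the abstract


def render_part(pose, rng):
    """Stand-in for the CAD renderer.

    A real implementation would rasterize the ECU housing under the given
    pose while randomizing textures, lighting, camera jitter and background
    (domain randomization); here we emit noise so the sketch stays runnable.
    """
    return rng.uniform(0.0, 1.0, size=(IMG_SIZE, IMG_SIZE, 3)).astype("float32")


def synthetic_batch(batch_size, rng):
    """Sample random orientations and render one synthetic image per pose."""
    poses = rng.uniform(-np.pi, np.pi, size=(batch_size, 3)).astype("float32")
    images = np.stack([render_part(p, rng) for p in poses])
    return images, poses


def build_pose_regressor():
    """Small CNN regressing three rotation angles from one RGB image."""
    return models.Sequential([
        layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3)),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(3),  # Euler angles; plain MSE ignores angle wrap-around
    ])


rng = np.random.default_rng(0)
model = build_pose_regressor()
model.compile(optimizer="adam", loss="mse")
images, poses = synthetic_batch(256, rng)
model.fit(images, poses, epochs=1, batch_size=32)  # purely synthetic training
```

In practice, plain MSE on Euler angles suffers from angle wrap-around, so a quaternion output or a wrap-aware loss would typically be preferred, and the noise stub would be replaced by a ray-traced or game-engine rendering of the CAD part.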

Zusammenfassung

Image processing methods are widely used in the series production of automotive components, for example in the production of electronic control units. Approaches based on deep neural networks with many layers (deep learning) have made impressive progress in recent years, but are currently used only to a limited extent in this industrial environment. A common obstacle is the lack of sufficient training data. We adapt a recently developed method that randomly varies images to improve robustness under changed conditions (domain randomization) to our application of 3D pose estimation for housing components of electronic control units. We generate exclusively synthetic training data with high visual diversity to train deep neural networks. This enables deep neural networks to estimate the 3D orientation of a real sample part with high accuracy from a single image taken in a production-like environment. The requirements on the measurement setup are very low. All steps of our approach run automatically and can be transferred to similar applications.

About the authors

Simon Bäuerle

Simon Bäuerle works in a joint research project of the Institute for Automation and Applied Informatics at the Karlsruhe Institute of Technology and the Robert Bosch GmbH in Reutlingen. Research interests: Machine learning, image processing, industrial AI.

Moritz Böhland

Moritz Böhland works at the Institute for Automation and Applied Informatics at the Karlsruhe Institute of Technology. Research interests: Machine learning, data mining, image processing.

Jonas Barth

Jonas Barth works at the Robert Bosch GmbH in Reutlingen. Research interests: Applied machine learning, data science, industrial AI.

apl. Prof. Dr. Markus Reischl

Markus Reischl is head of the research group “Machine Learning for High-Throughput Methods and Mechatronics” of the Institute for Automation and Applied Informatics at the Karlsruhe Institute of Technology. Research interests: Man-machine interfaces, image processing, machine learning, data analytics.

Dr. sc. Andreas Steimer

Andreas Steimer works at the Bosch Center for Artificial Intelligence at the Robert Bosch GmbH in Renningen. Research interests: Machine learning, data science, industrial AI.

apl. Prof. Dr.-Ing. Ralf Mikut

Ralf Mikut is Head of the Research Area Automated Image and Data Analysis at the Institute for Automation and Applied Informatics at the Karlsruhe Institute of Technology and Speaker of the Helmholtz Information and Data Science School for Health (HIDSS4Health). Research interests: Computational intelligence, data analytics, modelling and image processing with applications in biology, chemistry, medical engineering, energy systems and robotics.

Author contributions: We describe the individual contributions of Simon Bäuerle (SB), Moritz Böhland (MB), Jonas Barth (JB), Markus Reischl (MR), Andreas Steimer (AS) and Ralf Mikut (RM) using CRediT [4]: Writing – Original Draft: SB; Writing – Review & Editing: MB, JB, MR, AS, RM; Conceptualization: SB, JB, AS, RM; Investigation: SB, MB; Methodology: SB; Software: SB, MB; Supervision: JB, MR, AS, RM; Project Administration: JB, RM; Funding Acquisition: JB.

References

1. Ammirato, P.; Tremblay, J.; Liu, M.-Y.; Berg, A.; Fox, D.: SymGAN: Orientation estimation without annotation for symmetric objects. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), p. 1668–1677, 2020. DOI: 10.1109/WACV45572.2020.9093450.

2. Andrychowicz, O. M.; Baker, B.; Chociej, M.; Józefowicz, R.; McGrew, B.; Pachocki, J.; Petron, A.; Plappert, M.; Powell, G.; Ray, A.; Schneider, J.; Sidor, S.; Tobin, J.; Welinder, P.; Weng, L.; Zaremba, W.: Learning dexterous in-hand manipulation. The International Journal of Robotics Research 39 (2020) 1, p. 3–20. DOI: 10.1177/0278364919887447.

3. Baeuerle, S.; Barth, J.; Tavares de Menezes, E. R.; Steimer, A.; Mikut, R.: CAD2Real: Deep learning with domain randomization of CAD data for 3D pose estimation of electronic control unit housings. In: Proceedings – 30. Workshop Computational Intelligence, p. 33–52, KIT Scientific Publishing, 2020.

4. Brand, A.; Allen, L.; Altman, M.; Hlava, M.; Scott, J.: Beyond authorship: attribution, contribution, collaboration, and credit. Learned Publishing 28 (2015) 2, p. 151–155. DOI: 10.1087/20150211.

5. Bäuerle, A.; Ropinski, T.: Net2Vis: Transforming deep convolutional networks into publication-ready visualizations. Techn. Rep., arXiv:1902.04394, 2019.

6. Böhland, M.; Scherr, T.; Bartschat, A.; Mikut, R.; Reischl, M.: Influence of synthetic label image object properties on GAN supported segmentation pipelines. In: Proceedings – 29. Workshop Computational Intelligence, p. 289–309, KIT Scientific Publishing, 2019.

7. Chollet, F.; et al.: Keras. https://keras.io, 2015.

8. Do, T.-T.; Cai, M.; Pham, T.; Reid, I.: Deep-6DPose: Recovering 6D object pose from a single RGB image. Techn. Rep., arXiv:1802.10367, 2018.

9. Grün, S.; Höninger, S.; Scheikl, P. M.; Hein, B.; Kröger, T.: Evaluation of domain randomization techniques for transfer learning. In: 2019 19th International Conference on Advanced Robotics (ICAR), p. 481–486, IEEE, 2019. DOI: 10.1109/ICAR46387.2019.8981654.

10. He, K.; Zhang, X.; Ren, S.; Sun, J.: Deep residual learning for image recognition. Techn. Rep., arXiv:1512.03385, 2015. DOI: 10.1109/CVPR.2016.90.

11. Hinterstoisser, S.; Pauly, O.; Heibel, H.; Martina, M.; Bokeloh, M.: An annotation saved is an annotation earned: Using fully synthetic training for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2019. DOI: 10.1109/ICCVW.2019.00340.

12. James, S.; Wohlhart, P.; Kalakrishnan, M.; Kalashnikov, D.; Irpan, A.; Ibarz, J.; Levine, S.; Hadsell, R.; Bousmalis, K.: Sim-to-real via sim-to-sim: Data-efficient robotic grasping via randomized-to-canonical adaptation networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), p. 12627–12637, 2019. DOI: 10.1109/CVPR.2019.01291.

13. Kehl, W.; Manhardt, F.; Tombari, F.; Ilic, S.; Navab, N.: SSD-6D: Making RGB-based 3D detection and 6D pose estimation great again. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), p. 1521–1529, 2017. DOI: 10.1109/ICCV.2017.169.

14. Kendall, A.; Grimes, M.; Cipolla, R.: PoseNet: A convolutional network for real-time 6-DOF camera relocalization. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), p. 2938–2946, 2015. DOI: 10.1109/ICCV.2015.336.

15. Khirodkar, R.; Yoo, D.; Kitani, K.: Domain randomization for scene-specific car detection and pose estimation. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), p. 1932–1940, IEEE, 2019. DOI: 10.1109/WACV.2019.00210.

16. Kleeberger, K.; Huber, M. F.: Single shot 6D object pose estimation. Techn. Rep., arXiv:2004.12729, 2020. DOI: 10.1109/ICRA40945.2020.9197207.

17. Kuznetsova, A.; Rom, H.; Alldrin, N.; Uijlings, J.; Krasin, I.; Pont-Tuset, J.; Kamali, S.; Popov, S.; Malloci, M.; Kolesnikov, A.; Duerig, T.; Ferrari, V.: The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale. IJCV (2020). DOI: 10.1007/s11263-020-01316-z.

18. Liu, M.-Y.; Breuel, T.; Kautz, J.: Unsupervised image-to-image translation networks. In: Advances in Neural Information Processing Systems (Guyon, I.; Luxburg, U. V.; Bengio, S.; Wallach, H.; Fergus, R.; Vishwanathan, S.; Garnett, R., Eds.), Vol. 30, p. 700–708, Curran Associates, Inc., 2017.

19. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A. C.: SSD: Single Shot MultiBox Detector. In: Computer Vision – ECCV 2016 (Leibe, B.; Matas, J.; Sebe, N.; Welling, M., Eds.), p. 21–37, Cham: Springer International Publishing, 2016. DOI: 10.1007/978-3-319-46448-0_2.

20. Mahendran, S.; Ali, H.; Vidal, R.: 3D pose regression using convolutional neural networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops, p. 2174–2182, 2017. DOI: 10.1109/ICCVW.2017.254.

21. Peng, X. B.; Andrychowicz, M.; Zaremba, W.; Abbeel, P.: Sim-to-real transfer of robotic control with dynamics randomization. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), p. 3803–3810, IEEE, 2018. DOI: 10.1109/ICRA.2018.8460528.

22. Rad, M.; Lepetit, V.: BB8: A scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), p. 3828–3836, 2017. DOI: 10.1109/ICCV.2017.413.

23. Ren, X.; Luo, J.; Solowjow, E.; Ojea, J. A.; Gupta, A.; Tamar, A.; Abbeel, P.: Domain randomization for active pose estimation. In: 2019 International Conference on Robotics and Automation (ICRA), p. 7228–7234, IEEE, 2019. DOI: 10.1109/ICRA.2019.8794126.

24. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; Berg, A. C.; Fei-Fei, L.: ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV) 115 (2015) 3, p. 211–252. DOI: 10.1007/s11263-015-0816-y.

25. Sadeghi, F.; Levine, S.: CAD2RL: Real single-image flight without a single real image. Techn. Rep., arXiv:1611.04201, 2016. DOI: 10.15607/RSS.2017.XIII.034.

26. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p. 4510–4520, 2018. DOI: 10.1109/CVPR.2018.00474.

27. Schutera, M.; Hussein, M.; Abhau, J.; Mikut, R.; Reischl, M.: Night-to-Day: Online image-to-image translation for object detection within autonomous driving by night. IEEE Transactions on Intelligent Vehicles (2020). DOI: 10.1109/TIV.2020.3039456.

28. Sundermeyer, M.; Durner, M.; Puang, E. Y.; Marton, Z.-C.; Vaskevicius, N.; Arras, K. O.; Triebel, R.: Multi-path learning for object pose estimation across domains. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), p. 13916–13925, 2020. DOI: 10.1109/CVPR42600.2020.01393.

29. Sundermeyer, M.; Marton, Z.-C.; Durner, M.; Brucker, M.; Triebel, R.: Implicit 3D orientation learning for 6D object detection from RGB images. In: Proceedings of the European Conference on Computer Vision (ECCV), p. 699–715, 2018. DOI: 10.1007/978-3-030-01231-1_43.

30. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p. 2818–2826, 2016. DOI: 10.1109/CVPR.2016.308.

31. Tobin, J.; Fong, R.; Ray, A.; Schneider, J.; Zaremba, W.; Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), p. 23–30, IEEE, 2017. DOI: 10.1109/IROS.2017.8202133.

32. Tremblay, J.; Prakash, A.; Acuna, D.; Brophy, M.; Jampani, V.; Anil, C.; To, T.; Cameracci, E.; Boochoon, S.; Birchfield, S.: Training deep networks with synthetic data: Bridging the reality gap by domain randomization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, p. 969–977, 2018. DOI: 10.1109/CVPRW.2018.00143.

33. Tremblay, J.; To, T.; Sundaralingam, B.; Xiang, Y.; Fox, D.; Birchfield, S.: Deep object pose estimation for semantic robotic grasping of household objects. Techn. Rep., arXiv:1809.10790, 2018.

34. Wang, M.; Deng, W.: Deep visual domain adaptation: A survey. Neurocomputing 312 (2018), p. 135–153. DOI: 10.1016/j.neucom.2018.05.083.

35. Xiang, Y.; Schmidt, T.; Narayanan, V.; Fox, D.: PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes. Techn. Rep., arXiv:1711.00199, 2017. DOI: 10.15607/RSS.2018.XIV.019.

36. Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A. A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), p. 2223–2232, 2017. DOI: 10.1109/ICCV.2017.244.

Received: 2021-01-28
Accepted: 2021-07-30
Published Online: 2021-10-01
Published in Print: 2021-10-26

© 2021 Walter de Gruyter GmbH, Berlin/Boston
