DOI: 10.1145/3208806.3208813
research-article

Improving mobile MR applications using a cloud-based image segmentation approach with synthetic training data

Published: 20 June 2018

ABSTRACT

In this paper, we show how the quality of augmentation in mobile Mixed Reality (MR) applications can be improved using a cloud-based image segmentation approach with synthetic training data. Many modern Augmented Reality (AR) frameworks are based on visual inertial odometry on mobile devices and therefore have limited access to tracking hardware (e.g., a depth sensor). Consequently, tracking still suffers from drift, which makes these frameworks difficult to use in cases that require higher precision. To improve tracking quality, we propose a cloud tracking approach that uses machine-learning-based image segmentation to recognize known objects in a real scene, which allows us to estimate a precise camera pose. AR applications that use our web service can apply the resulting camera pose to correct drift from time to time, while still using local tracking between key frames. Moreover, the device's position in the real world when the application starts is usually used as the reference coordinate system. Our approach instead provides a well-defined coordinate system that is context-based and does not depend on the user's starting position, which significantly simplifies the authoring of MR applications. We present all steps, from web-based initialization through the generation of synthetic training data to usage in production. In addition, we describe the underlying algorithms in detail. Finally, we show a mobile Mixed Reality application that is based on this novel approach and discuss its advantages.
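
The abstract does not spell out how the cloud-estimated pose is combined with local tracking, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes that poses are 4x4 rigid camera-to-world matrices, that the local tracker and the web service report a pose for the same key frame, and that a single rigid correction is reused until the next key frame. The names `DriftCorrector`, `on_key_frame`, and `corrected` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # assumed format: 4x4 homogeneous camera-to-world pose


def identity() -> Matrix:
    """Return the 4x4 identity transform."""
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(t: Matrix) -> Matrix:
    """Invert a rigid transform [R | p] as [R^T | -R^T p]."""
    rt = [[t[j][i] for j in range(3)] for i in range(3)]  # transpose of R
    p = [t[i][3] for i in range(3)]
    inv_p = [-sum(rt[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [rt[0] + [inv_p[0]],
            rt[1] + [inv_p[1]],
            rt[2] + [inv_p[2]],
            [0.0, 0.0, 0.0, 1.0]]


class DriftCorrector:
    """Hypothetical helper: maps drifting local poses into the context-based frame."""

    def __init__(self) -> None:
        # Until the first key frame arrives, local poses pass through unchanged.
        self.correction: Matrix = identity()

    def on_key_frame(self, local_pose: Matrix, cloud_pose: Matrix) -> None:
        # Choose the correction so that correction * local_pose == cloud_pose
        # for this key frame.
        self.correction = matmul(cloud_pose, rigid_inverse(local_pose))

    def corrected(self, local_pose: Matrix) -> Matrix:
        # Between key frames, apply the most recent correction to local tracking.
        return matmul(self.correction, local_pose)


def translation(x: float, y: float, z: float) -> Matrix:
    """Build a pure-translation pose (used only for the usage example)."""
    m = identity()
    m[0][3], m[1][3], m[2][3] = x, y, z
    return m


if __name__ == "__main__":
    corrector = DriftCorrector()
    # Key frame: local tracking reports x = 1.0, the cloud service reports x = 1.5, z = 2.0.
    corrector.on_key_frame(translation(1.0, 0.0, 0.0), translation(1.5, 0.0, 2.0))
    # A later local pose is shifted by the same offset.
    print(corrector.corrected(translation(2.0, 0.0, 0.0)))
```

Because the correction is expressed relative to the recognized object rather than the device's start position, the corrected poses live in the same coordinate system regardless of where the user launched the application. A production system would likely also blend or filter successive corrections to avoid visible jumps; that is omitted here.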


              Published in

              Web3D '18: Proceedings of the 23rd International ACM Conference on 3D Web Technology
              June 2018
              199 pages
              ISBN: 9781450358002
              DOI: 10.1145/3208806

              Copyright © 2018 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States


              Acceptance Rates

              Overall Acceptance Rate: 27 of 71 submissions, 38%
