Occlusion and Collision Aware Smartphone AR Using Time-of-Flight Camera

  • Conference paper
  • First Online:
Advances in Visual Computing (ISVC 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11845)

Included in the following conference series: International Symposium on Visual Computing (ISVC)

Abstract

The development of Visual Inertial Odometry (VIO) systems such as ARKit and ARCore has brought smartphone Augmented Reality (AR) to the mainstream. However, interactions between virtual objects and real objects are still limited due to the lack of 3D sensing capability. Recently, smartphone makers have begun equipping their phones with Time-of-Flight (ToF) cameras, which measure per-pixel depth of the scene using infrared light. By understanding the 3D structure of the scene, more AR capabilities can be enabled. In this paper, we propose practical methods to process ToF depth maps in real time and simultaneously enable occlusion handling and collision detection for AR applications. Our experimental results show real-time performance and good visual quality for both occlusion rendering and collision detection.
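To make the two capabilities concrete, the sketch below shows, in plain NumPy, how a single ToF depth frame could drive both effects: a per-pixel depth test that hides virtual fragments lying behind real surfaces (occlusion), and a back-projected point cloud tested against a virtual object's bounding sphere (collision). This is a minimal illustration only; the function names, the pinhole intrinsics (fx, fy, cx, cy), and the noise tolerance eps are assumptions for exposition, not the method proposed in the paper.

```python
import numpy as np

def occlusion_mask(tof_depth, virtual_depth, eps=0.01):
    """Per-pixel occlusion test (illustrative, not the paper's pipeline).

    tof_depth     : HxW real-scene depth in meters, 0 where the ToF sample is invalid
    virtual_depth : HxW depth buffer of the rendered virtual object,
                    np.inf where the object does not cover the pixel
    Returns a boolean HxW mask that is True where the virtual fragment
    should remain visible (i.e., it lies in front of the real surface).
    """
    valid = tof_depth > 0                                 # skip invalid ToF pixels
    occluded = valid & (tof_depth + eps < virtual_depth)  # real surface is closer
    return ~occluded

def sphere_hits_scene(tof_depth, fx, fy, cx, cy, center, radius):
    """Coarse collision test: back-project every valid ToF pixel into a
    camera-space point and check whether any point lies inside the virtual
    object's bounding sphere (center, radius), all in meters."""
    h, w = tof_depth.shape
    v, u = np.mgrid[0:h, 0:w]          # pixel row (v) and column (u) indices
    z = tof_depth
    valid = z > 0
    x = (u - cx) * z / fx              # pinhole back-projection
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1)[valid]   # N x 3 camera-space point cloud
    return bool(np.any(np.linalg.norm(points - np.asarray(center), axis=1) < radius))
```

A real pipeline would presumably denoise and upsample the low-resolution ToF map first and run the occlusion test as a per-fragment depth comparison on the GPU; the snippet only conveys the underlying geometry.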



Author information

Corresponding author

Correspondence to Yuan Tian.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Tian, Y., Ma, Y., Quan, S., Xu, Y. (2019). Occlusion and Collision Aware Smartphone AR Using Time-of-Flight Camera. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol 11845. Springer, Cham. https://doi.org/10.1007/978-3-030-33723-0_12

  • DOI: https://doi.org/10.1007/978-3-030-33723-0_12

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33722-3

  • Online ISBN: 978-3-030-33723-0

  • eBook Packages: Computer Science (R0)
