Learning to Reconstruct 3D Structure from Object Motion

Liu, Wentao; Dou, Haobin; Wu, Xihong

doi:10.1007/978-3-319-26532-2_15

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9489)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

In this paper, we propose a new approach for reconstructing 3D structure from motion parallax. Instead of obtaining 3D structure from multi-view geometry or factorization, the approach uses a Deep Neural Network (DNN) and does not assume an explicit camera model. The targets are first split into connected 3D corners, and a DNN regressor is then trained to estimate the relative 3D structure of each corner from the target rotation. Finally, temporal integration is performed to further improve the reconstruction accuracy. The effectiveness of the method is demonstrated in a typical experiment on the Kinetic Depth Effect (KDE) in the human visual system, in which the DNN regressor reconstructs the structure of a rotating 3D bent wire. The proposed method is also applied to reconstruct two further real targets. Experimental results on both synthetic and real images show that the proposed method is accurate and effective.
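
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unit-length wire segments, the orthographic projection used to render the stimulus, and the plain averaging used for temporal integration are all assumptions made here for illustration, and the DNN regressor itself is left out.

```python
import math
import random


def make_bent_wire(n_segments, seed=0):
    """Build a random 3D polyline (a 'bent wire') of unit-length segments.

    Hypothetical helper: the abstract does not say how wires are generated.
    """
    rng = random.Random(seed)
    points = [(0.0, 0.0, 0.0)]
    for _ in range(n_segments):
        # Draw a direction uniformly on the unit sphere.
        dz = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        r = math.sqrt(1.0 - dz * dz)
        x0, y0, z0 = points[-1]
        points.append((x0 + r * math.cos(phi), y0 + r * math.sin(phi), z0 + dz))
    return points


def rotate_y(points, angle):
    """Rotate 3D points about the vertical (y) axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]


def project_orthographic(points):
    """Drop depth to get 2D image points (assumed only to render the stimulus)."""
    return [(x, y) for x, y, _ in points]


def split_into_corners(points):
    """Split a polyline into overlapping corners: each interior vertex with its two neighbours."""
    return [tuple(points[i - 1:i + 2]) for i in range(1, len(points) - 1)]


def integrate_over_time(per_frame_estimates):
    """Average per-frame estimates element-wise (assumed form of temporal integration)."""
    n = len(per_frame_estimates)
    return [sum(values) / n for values in zip(*per_frame_estimates)]


# Render one full rotation of a 4-segment wire in 10-degree steps.
wire = make_bent_wire(4, seed=1)
frames = [project_orthographic(rotate_y(wire, math.radians(a))) for a in range(0, 360, 10)]
corners = split_into_corners(wire)
```

In such a setup, a regressor would receive the 2D corner trajectories taken from `frames` and predict a relative structure per corner, and `integrate_over_time` would then combine the predictions made at different frames.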

Acknowledgement

The work was supported in part by the National Basic Research Program of China (2013CB329304), the “Twelfth Five-Year” National Science & Technology Support Program of China (No. 2012BAI12B01), the Major Project of the National Social Science Foundation of China (No. 12&ZD119), the research special fund for public welfare industry of health (201202001) and the National Natural Science Foundation of China (No. 81170906).

Author information

Authors and Affiliations

Key Lab of Machine Perception (MOE), Speech and Hearing Research Center, Peking University, Beijing, People’s Republic of China
Wentao Liu, Haobin Dou & Xihong Wu

Corresponding author

Correspondence to Wentao Liu.

Editor information

Editors and Affiliations

University of Istanbul, Istanbul, Turkey
Sabri Arik
University at Qatar, Doha, Qatar
Tingwen Huang
Tunku Abdul Rahman University College, Kuala Lumpur, Malaysia
Weng Kin Lai
University of Science Technology, Wuhan, China
Qingshan Liu

About this paper

Cite this paper

Liu, W., Dou, H., Wu, X. (2015). Learning to Reconstruct 3D Structure from Object Motion. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9489. Springer, Cham. https://doi.org/10.1007/978-3-319-26532-2_15

DOI: https://doi.org/10.1007/978-3-319-26532-2_15
Published: 12 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26531-5
Online ISBN: 978-3-319-26532-2
eBook Packages: Computer Science, Computer Science (R0)
