Abstract
Scene flow is the 3D motion field of the points in a scene, whose projections correspond to image pixels. Most existing algorithms require a calibrated stereo rig before flow can be estimated, which places strong restrictions on camera placement. This paper proposes a scene flow estimation algorithm for a monocular camera. First, an energy functional is constructed in which three assumptions are turned into data terms: brightness constancy, gradient constancy, and constancy of object velocity over short time intervals. Two smoothness operators serve as regularization terms. Next, an occlusion-map computation algorithm ensures that scene flow is estimated only at unoccluded points. The energy functional is then minimized with a coarse-to-fine variational scheme on a Gaussian pyramid, which prevents the iteration from converging to a poor local minimum. Experimental results show that the algorithm recovers scene flow in world coordinates from as few as three sequential frames, without requiring optical flow or disparity as input.
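As a reading aid, an energy functional built from these assumptions typically takes a form like the following. This is a hedged sketch, not the paper's exact formulation: $\mathbf{w}$ denotes the scene flow, $\Psi$ a robust penalty, and the weights $\gamma$, $\mu$, $\alpha$ are illustrative placeholders.

$$
E(\mathbf{w}) = \int_\Omega \Psi\!\big(|I(\mathbf{x}+\mathbf{w}) - I(\mathbf{x})|^2\big)\,d\mathbf{x}
+ \gamma \int_\Omega \Psi\!\big(|\nabla I(\mathbf{x}+\mathbf{w}) - \nabla I(\mathbf{x})|^2\big)\,d\mathbf{x}
+ \mu \int_\Omega \Psi\!\big(|\mathbf{w}_{t} - \mathbf{w}_{t-1}|^2\big)\,d\mathbf{x}
+ \alpha \int_\Omega \Psi\!\big(|\nabla \mathbf{w}|^2\big)\,d\mathbf{x}
$$

The first three integrals correspond to the brightness constancy, gradient constancy, and short-time velocity constancy data terms; the last stands in for the smoothness regularization.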
Acknowledgments
The authors would like to thank the anonymous reviewers for their insightful comments and suggestions. This work is supported in part by the National Natural Science Foundation of China (Grant Nos. 61272062 and 61300036) and by the Projects in the National Science & Technology Pillar Program (Grant No. 2013BAH38F01).
Appendix
1.1 Iterative process
Our iterative process is divided into two layers. The outer layer traverses the Gaussian pyramid from coarse to fine; iterations at this level update the unknowns themselves. At each pyramid level, the inner iterations obtain the increments of the unknowns by SOR iteration. As Fig. 12 shows, the Gaussian pyramid is built according to the given number of outer iterations, and during construction we compute a scaling factor for each level. To preserve the correct correspondence between image space and world space, the scaling factor must be applied not only to the image resolution but also to the focal length and optical center of the camera. In the inner iteration, the initial values of the scene flow are set to zero. Starting from the lowest-resolution level of the Gaussian pyramid, SOR iteration computes the increments of the unknowns until convergence or until the iteration limit is reached. The final value of each inner iteration is added to the current outer-layer estimate and used as the initial value for the next outer layer. Algorithm 2 shows the whole iteration process.
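To make the two-layer structure concrete, the sketch below outlines it in Python (NumPy + OpenCV). It is illustrative only, not the paper's implementation: build_pyramid, sor_step, and estimate_scene_flow are hypothetical names, and sor_step is a stub standing in for the actual SOR relaxation of the linearized energy.

```python
import cv2
import numpy as np

def build_pyramid(img, K, levels, scale=0.5):
    """Gaussian pyramid that rescales the camera intrinsics (focal
    length and optical center) together with the image resolution."""
    imgs, Ks = [img], [K.astype(np.float64)]
    for _ in range(1, levels):
        imgs.append(cv2.pyrDown(imgs[-1]))
        Kl = Ks[-1].copy()
        Kl[:2, :] *= scale            # fx, fy, cx, cy shrink with the image
        Ks.append(Kl)
    return imgs[::-1], Ks[::-1]       # coarsest level first

def sor_step(flow, imgs, K, omega=1.9):
    """Placeholder for one SOR sweep on the linearized Euler-Lagrange
    equations; a real implementation assembles the data and smoothness
    terms of the energy functional and relaxes the resulting system."""
    return np.zeros_like(flow)        # hypothetical: no actual solve here

def estimate_scene_flow(frames, K, outer_iters=10, inner_iters=10):
    pyramids = [build_pyramid(f, K, outer_iters) for f in frames]
    flow = None
    for level in range(outer_iters):             # outer layer: coarse to fine
        imgs = [p[0][level] for p in pyramids]
        K_level = pyramids[0][1][level]
        h, w = imgs[0].shape[:2]
        if flow is None:
            flow = np.zeros((h, w, 3))           # initial scene flow is zero
        else:
            # Scene flow lives in world coordinates, so prolongation to the
            # finer level only resamples the grid, without rescaling values.
            flow = cv2.resize(flow, (w, h))
        for _ in range(inner_iters):             # inner layer: SOR increments
            d = sor_step(flow, imgs, K_level)
            flow += d                            # add increment to estimate
            if np.abs(d).max() < 1e-4:           # stop early on convergence
                break
    return flow
```

Note how the intrinsic matrix is rescaled alongside the image at every level; this is what keeps image space and world space consistent, as the text above requires.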
The numbers of inner and outer iterations must be chosen before the iterative process runs. The number of outer iterations determines the number of pyramid levels; our experiments set it to 10 because of memory limitations. Figure 13 shows the relationship between the erroneous percentage and the number of inner iterations, measured at the last outer layer. We set the number of inner iterations to 10, since the polyline shows that more than 10 iterations cause over-smoothing.

Cite this article
Xiao, D., Yang, Q., Yang, B. et al. Monocular scene flow estimation via variational method. Multimed Tools Appl 76, 10575–10597 (2017). https://doi.org/10.1007/s11042-015-3091-6