Efficient Dense Rigid-Body Motion Segmentation and Estimation in RGB-D Video

International Journal of Computer Vision

Abstract

Motion is a fundamental grouping cue in video. Many current approaches to motion segmentation in monocular or stereo image sequences either rely on sparse interest points or are dense but computationally demanding. We propose an efficient expectation–maximization (EM) framework for dense 3D segmentation of moving rigid parts in RGB-D video. Our approach segments images into pixel regions that undergo coherent 3D rigid-body motion. Our formulation treats background and foreground objects equally and makes no assumptions about the motion of the camera or the objects beyond rigidity. While our EM formulation is not restricted to a specific image representation, we supplement it with an efficient image representation and registration method for rapid segmentation of RGB-D video. In experiments, we demonstrate that our approach recovers segmentation and 3D motion with good precision.
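
To make the EM idea in the abstract concrete, the following is a toy sketch and not the authors' implementation. It assumes that 3D point correspondences between two frames are already known, that residuals are isotropic Gaussian with a hand-picked `sigma`, that the number of motions is fixed in advance, and that no spatial regularisation is applied; none of these assumptions is stated on this page. All function names (`weighted_rigid_fit`, `em_rigid_segmentation`, `make_toy_pair`) are hypothetical. The M-step fits one rigid motion per segment by a weighted Kabsch solve, and the E-step reassigns each point softly according to how well each motion explains its displacement.

```python
import numpy as np


def weighted_rigid_fit(P, Q, w):
    """Weighted least-squares rigid motion (R, t) such that Q ~ P @ R.T + t."""
    w = w / w.sum()
    p_mean, q_mean = w @ P, w @ Q
    # Weighted cross-covariance of the centred point sets (Kabsch solve).
    H = (P - p_mean).T @ (w[:, None] * (Q - q_mean))
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, q_mean - R @ p_mean


def em_rigid_segmentation(P, Q, init_labels, n_motions, n_iters=10, sigma=0.02):
    """Alternate motion fitting (M-step) and soft point assignment (E-step)."""
    resp = np.eye(n_motions)[init_labels]  # hard initial responsibilities
    for _ in range(n_iters):
        # M-step: one rigid motion per segment, weighted by responsibilities.
        # The small constant keeps the weights from summing to zero.
        motions = [weighted_rigid_fit(P, Q, resp[:, k] + 1e-12)
                   for k in range(n_motions)]
        # E-step: Gaussian likelihood of each point under each motion.
        sq_err = np.stack(
            [np.sum((Q - (P @ R.T + t)) ** 2, axis=1) for R, t in motions],
            axis=1)
        log_lik = -0.5 * sq_err / sigma ** 2
        log_lik -= log_lik.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_lik)
        resp /= resp.sum(axis=1, keepdims=True)
    return resp.argmax(axis=1), motions


def make_toy_pair(n_per_motion=100, n_flipped=5, seed=0):
    """Synthetic data: one static point set and one that rotates and shifts."""
    rng = np.random.default_rng(seed)
    P = rng.uniform(-1.0, 1.0, size=(2 * n_per_motion, 3))
    true_labels = np.repeat([0, 1], n_per_motion)
    c, s = np.cos(0.3), np.sin(0.3)
    R1 = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    t1 = np.array([0.3, 0.0, 0.2])
    Q = P.copy()
    Q[true_labels == 1] = P[true_labels == 1] @ R1.T + t1
    # Corrupt the initial labelling by flipping a few points in each group.
    init_labels = true_labels.copy()
    flip = np.r_[0:n_flipped, n_per_motion:n_per_motion + n_flipped]
    init_labels[flip] = 1 - init_labels[flip]
    return P, Q, true_labels, init_labels, (R1, t1)
```

A possible usage is `P, Q, true_labels, init_labels, _ = make_toy_pair()` followed by `labels, motions = em_rigid_segmentation(P, Q, init_labels, n_motions=2)`. EM of this kind is sensitive to initialisation, which is why the toy data starts from a mostly correct labelling; a dense RGB-D method would additionally need data association and some way to choose the number of motions, both of which this sketch leaves out.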

Notes

  1. available from http://www.ais.uni-bonn.de/download/rigidmultibody.

  2. Due to the high run-time requirements of the method, we evaluated the approach at full frame-rate for sequence lengths that are multiples of 30 frames.

Author information

Corresponding author

Correspondence to Jörg Stückler.

Additional information

Communicated by Tilo Burghardt, Majid Mirmehdi, Walterio Mayol-Cuevas, and Dima Damen.

Cite this article

Stückler, J., Behnke, S. Efficient Dense Rigid-Body Motion Segmentation and Estimation in RGB-D Video. Int J Comput Vis 113, 233–245 (2015). https://doi.org/10.1007/s11263-014-0796-3
