A Feature-based Approach for Dense Segmentation and Estimation of Large Disparity Motion

Wills, Josh; Agarwal, Sameer; Belongie, Serge

doi:10.1007/s11263-006-6660-3

Published: 01 March 2006

Volume 68, pages 125–143, (2006)

International Journal of Computer Vision

Josh Wills¹,
Sameer Agarwal¹ &
Serge Belongie¹

Abstract

We present a novel framework for motion segmentation that combines the concepts of layer-based methods and feature-based motion estimation. We estimate the initial correspondences by comparing vectors of filter outputs at interest points, from which we compute candidate scene relations via random sampling of minimal subsets of correspondences. We achieve a dense, piecewise smooth assignment of pixels to motion layers using a fast approximate graphcut algorithm based on a Markov random field formulation. We demonstrate our approach on image pairs containing large inter-frame motion and partial occlusion. The approach is efficient and it successfully segments scenes with inter-frame disparities previously beyond the scope of layer-based motion segmentation methods. We also present an extension that accounts for the case of non-planar motion, in which we use our planar motion segmentation results as an initialization for a regularized Thin Plate Spline fit. In addition, we present applications of our method to automatic object removal and to structure from motion.
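The random-sampling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a 2D affine motion model for simplicity (the abstract does not state which planar model is used here), and the function names, iteration count, and inlier tolerance are all hypothetical choices for illustration, not the authors' implementation. Each iteration draws a minimal subset of three correspondences, fits a candidate motion, and scores it by how many correspondences it explains.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine map A (2x3) such that dst ~ A @ [x, y, 1]."""
    X = np.hstack([src, np.ones((len(src), 1))])    # (n, 3) homogeneous points
    sol, *_ = np.linalg.lstsq(X, dst, rcond=None)   # (3, 2) solution
    return sol.T                                    # (2, 3)

def apply_affine(A, pts):
    """Apply a 2x3 affine map to an (n, 2) array of points."""
    return pts @ A[:, :2].T + A[:, 2]

def ransac_affine(src, dst, n_iters=200, tol=2.0, seed=0):
    """Estimate one dominant affine motion from noisy correspondences.

    n_iters and tol are illustrative values, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    best_A = None
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Minimal subset: three correspondences determine an affine map.
        idx = rng.choice(len(src), size=3, replace=False)
        A = fit_affine(src[idx], dst[idx])
        err = np.linalg.norm(apply_affine(A, src) - dst, axis=1)
        inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_A, best_inliers = A, inliers
    # Refit on all inliers of the best candidate for a more stable estimate.
    if best_A is not None and best_inliers.sum() >= 3:
        best_A = fit_affine(src[best_inliers], dst[best_inliers])
    return best_A, best_inliers

# Synthetic example: 40 correspondences follow one affine motion,
# 10 are gross mismatches.
rng = np.random.default_rng(1)
src = rng.uniform(0, 100, size=(50, 2))
A_true = np.array([[0.9, -0.2, 30.0],
                   [0.2,  0.9, -10.0]])
dst = apply_affine(A_true, src)
dst[40:] += rng.uniform(50, 80, size=(10, 2))

A_est, inliers = ransac_affine(src, dst)
```

In a multi-layer setting, one plausible way to use such a routine is to run it repeatedly, removing the inliers of each recovered motion before searching for the next; the resulting candidate motions would then serve as the labels for the dense pixel assignment step.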

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, 92093, USA
Josh Wills, Sameer Agarwal & Serge Belongie

Corresponding author

Correspondence to Josh Wills.

About this article

Cite this article

Wills, J., Agarwal, S. & Belongie, S. A Feature-based Approach for Dense Segmentation and Estimation of Large Disparity Motion. Int J Comput Vision 68, 125–143 (2006). https://doi.org/10.1007/s11263-006-6660-3

Received: 04 August 2004
Revised: 20 October 2005
Accepted: 16 November 2005
Published: 01 March 2006
Issue Date: June 2006
DOI: https://doi.org/10.1007/s11263-006-6660-3
