
Robust background identification for dynamic video editing

Published: 05 December 2016


Extracting background features for estimating the camera path is a key step in many video editing and enhancement applications. Existing approaches often fail on highly dynamic videos that are shot by moving cameras and contain severe foreground occlusion. Based on existing theories, we present a new, practical method that can reliably identify background features in complex video, leading to accurate camera path estimation and background layering. Our approach contains a local motion analysis step and a global optimization step. We first divide the input video into overlapping temporal windows, and extract local motion clusters in each window. We form a directed graph from these local clusters, and identify background ones by finding a minimal path through the graph using optimization. We show that our method significantly outperforms other alternatives, and can be directly used to improve common video editing applications such as stabilization, compositing and background reconstruction.
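The two steps named in the abstract, splitting the video into overlapping temporal windows and then choosing background clusters by a minimal path through a graph, can be illustrated with a small sketch. This is not the authors' implementation. It assumes that exactly one cluster per window is chosen, that graph edges connect only clusters in adjacent windows, and that the per-cluster (`unary`) and transition (`pairwise`) costs are supplied by the caller; the abstract does not define these costs, and the names `overlapping_windows` and `min_cost_path` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def overlapping_windows(n_frames: int, size: int, stride: int) -> List[Tuple[int, int]]:
    """Return half-open [start, end) frame ranges; ranges overlap when stride < size."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    windows: List[Tuple[int, int]] = []
    if n_frames <= 0:
        return windows
    start = 0
    while True:
        end = min(start + size, n_frames)
        windows.append((start, end))
        if end >= n_frames:
            break
        start += stride
    return windows


def min_cost_path(
    unary: Sequence[Sequence[float]],
    pairwise: Callable[[int, int, int], float],
) -> Tuple[float, List[int]]:
    """Choose one cluster per window so that the summed cost is minimal.

    unary[w][k]       -- assumed cost of labelling cluster k in window w as background
    pairwise(w, i, j) -- assumed cost of linking cluster i in window w
                         to cluster j in window w + 1
    Solved by dynamic programming over the layered directed graph.
    """
    if not unary or any(len(layer) == 0 for layer in unary):
        raise ValueError("every window needs at least one cluster")

    best = list(unary[0])            # cheapest cost of reaching each cluster so far
    back: List[List[int]] = []       # back-pointers, one list per transition

    for w in range(1, len(unary)):
        new_best: List[float] = []
        pointers: List[int] = []
        for j, cost_j in enumerate(unary[w]):
            # Cheapest predecessor in the previous window for cluster j.
            i_star = min(range(len(best)), key=lambda i: best[i] + pairwise(w - 1, i, j))
            new_best.append(best[i_star] + pairwise(w - 1, i_star, j) + cost_j)
            pointers.append(i_star)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path back from the last window to the first.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, path


if __name__ == "__main__":
    print(overlapping_windows(10, size=4, stride=2))
    toy_unary = [[1.0, 5.0], [4.0, 1.0], [1.0, 3.0]]
    print(min_cost_path(toy_unary, lambda w, i, j: 0.0 if i == j else 2.0))
```

In this toy setup the pairwise term penalizes switching cluster index between windows, so the path stays on a temporally consistent cluster even where another cluster has a lower cost in a single window. A real system would derive both costs from motion and appearance cues, which this sketch leaves unspecified.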









Published In

ACM Transactions on Graphics  Volume 35, Issue 6
November 2016
1045 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Association for Computing Machinery

New York, NY, United States




Author Tags

  1. background detection
  2. camera path estimation
  3. feature point trajectory
  4. video enhancement
  5. video stabilization


  • Research-article

Funding Sources

  • Natural Science Foundation of China
  • Tsinghua University Initiative Scientific Research Program
  • the General Financial Grant from the China Postdoctoral Science Foundation
  • the National High Technology Research and Development Program of China



Article Metrics

  • Downloads (Last 12 months): 29
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 16 Feb 2025



Cited By

  • (2024) Result Diversification in Search and Recommendation: A Survey. IEEE Transactions on Knowledge and Data Engineering 36:10 (5354-5373). DOI: 10.1109/TKDE.2024.3382262. Online publication date: 1-Oct-2024
  • (2023) Surveying of Nearshore Bathymetry Using UAVs Video Stitching. Journal of Marine Science and Engineering 11:4 (770). DOI: 10.3390/jmse11040770. Online publication date: 31-Mar-2023
  • (2022) Deep Online Fused Video Stabilization. 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (865-873). DOI: 10.1109/WACV51458.2022.00094. Online publication date: Jan-2022
  • (2022) IMU-Assisted Online Video Background Identification. IEEE Transactions on Image Processing 31 (4336-4351). DOI: 10.1109/TIP.2022.3183442. Online publication date: 2022
  • (2021) Practical Wide-Angle Portraits Correction with Deep Structured Models. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (3497-3505). DOI: 10.1109/CVPR46437.2021.00350. Online publication date: Jun-2021
  • (2020) Semi-supervised Trajectory Understanding with POI Attention for End-to-End Trip Recommendation. ACM Transactions on Spatial Algorithms and Systems 6:2 (1-25). DOI: 10.1145/3378890. Online publication date: 7-Feb-2020
  • (2020) Effective Video Stabilization via Joint Trajectory Smoothing and Frame Warping. IEEE Transactions on Visualization and Computer Graphics 26:11 (3163-3176). DOI: 10.1109/TVCG.2019.2923196. Online publication date: 1-Nov-2020
  • (2020) Temporally Coherent Video Harmonization Using Adversarial Networks. IEEE Transactions on Image Processing 29 (214-224). DOI: 10.1109/TIP.2019.2925550. Online publication date: 2020
  • (2020) Effective Video Frame Acquisition for Image Stitching. IEEE Access (1-1). DOI: 10.1109/ACCESS.2020.3041330. Online publication date: 2020
  • (2020) Multi-exposure photomontage with hand-held cameras. Computer Vision and Image Understanding (102929). DOI: 10.1016/j.cviu.2020.102929. Online publication date: Feb-2020
