Abstract
A wide variety of computer vision applications rely on superpixel or supervoxel algorithms as a preprocessing step, which underlines the importance these algorithms have gained in recent years. However, most methods lack temporal consistency or fail to produce temporally stable segmentations. In this paper, we propose a novel, contour-based approach that generates temporally consistent superpixels for video content. It can be expressed in an expectation-maximization framework and uses an efficient label propagation built on backward optical flow to encourage the preservation of superpixel shapes and their spatial constellation over time. Using established benchmark suites, we show the superior performance of our approach compared to state-of-the-art supervoxel and superpixel algorithms for video content.
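As a rough illustration of label propagation with backward optical flow, the sketch below warps the superpixel label map of the previous frame into the current frame. It is not the authors' implementation: the function name `propagate_labels`, the `(dx, dy)` flow layout, and the nearest-pixel rounding are assumptions made for this example.

```python
import numpy as np

def propagate_labels(prev_labels, backward_flow):
    """Warp a label map from frame t-1 to frame t using backward optical flow.

    prev_labels:   (H, W) integer superpixel labels of frame t-1.
    backward_flow: (H, W, 2) flow from frame t back to frame t-1, stored as
                   (dx, dy) per pixel (this layout is an assumption).
    """
    h, w = prev_labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Source position in frame t-1 for every pixel of frame t,
    # rounded to the nearest pixel and clamped to the image bounds.
    src_x = np.clip(np.rint(xs + backward_flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + backward_flow[..., 1]).astype(int), 0, h - 1)
    # Each pixel of frame t copies the label found at its source position.
    return prev_labels[src_y, src_x]
```

Because the lookup runs from the current frame back to the previous one, every pixel of the current frame receives exactly one label, so the propagated map has no holes. Such a map could then serve as the initialization that the optimization refines.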
Notes
1. The underlying assumption is that a temporal superpixel should share the same color in successive frames but not necessarily the same position.
2. The changes after 5 iterations are only marginal. It should be noted that the boundary can move more than 1 pixel per iteration.
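One way to read the assumption in note 1 is as a distance that compares a pixel against a color center shared by all frames of a temporal superpixel and a spatial center kept separately for each frame. The sketch below is only an illustration of that idea: the function name `assignment_distance`, the Euclidean norms, and the weight `alpha` are hypothetical and not taken from the paper.

```python
import numpy as np

def assignment_distance(pixel_color, pixel_xy, color_center, spatial_center_t, alpha=0.9):
    """Distance of one pixel to one temporal superpixel in frame t.

    color_center:     color mean shared across all frames (assumption).
    spatial_center_t: spatial mean of the superpixel in frame t only (assumption).
    alpha:            hypothetical trade-off between color and position.
    """
    d_color = np.linalg.norm(np.asarray(pixel_color, float) - np.asarray(color_center, float))
    d_space = np.linalg.norm(np.asarray(pixel_xy, float) - np.asarray(spatial_center_t, float))
    # Weighted sum: the color term ties frames together, the spatial term is per frame.
    return alpha * d_color + (1.0 - alpha) * d_space
```

With this split, a superpixel that moves between frames keeps a small color distance to its pixels, while its spatial center is free to follow the motion.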
© 2015 Springer International Publishing Switzerland
Cite this paper
Reso, M., Jachalsky, J., Rosenhahn, B., Ostermann, J. (2015). Superpixels for Video Content Using a Contour-Based EM Optimization. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9006. Springer, Cham. https://doi.org/10.1007/978-3-319-16817-3_45
DOI: https://doi.org/10.1007/978-3-319-16817-3_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16816-6
Online ISBN: 978-3-319-16817-3
eBook Packages: Computer Science, Computer Science (R0)