Motion Coherent Tracking Using Multi-label MRF Optimization

International Journal of Computer Vision

Abstract

We present a novel off-line algorithm for target segmentation and tracking in video. In our approach, video data is represented by a multi-label Markov Random Field model, and segmentation is accomplished by finding the minimum energy label assignment. We propose a novel energy formulation which incorporates both segmentation and motion estimation in a single framework. Our energy functions enforce motion coherence both within and across frames. We utilize state-of-the-art methods to efficiently optimize over a large number of discrete labels. In addition, we introduce a new ground-truth dataset, called Georgia Tech Segmentation and Tracking Dataset (GT-SegTrack), for the evaluation of segmentation accuracy in video tracking. We compare our method with several recent on-line tracking algorithms and provide quantitative and qualitative performance comparisons.
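To make the idea of a "minimum energy label assignment" concrete, here is a minimal sketch of a generic multi-label MRF on a pixel grid. It is not the energy or the optimizer used in the paper: the Potts smoothness term, the function names `mrf_energy` and `icm`, and the use of iterated conditional modes (a simple greedy optimizer) are all assumptions made for illustration. In a setting like the one described above, each label could stand for a combination of a segment (target or background) and a discrete displacement, but this sketch treats labels as plain integers.

```python
import numpy as np

def mrf_energy(labels, unary, pairwise_weight):
    """Energy of a labeling on a 4-connected grid.

    unary[y, x, k] is the (hypothetical) cost of giving pixel (y, x) label k.
    The pairwise term is a Potts penalty: a fixed cost for every pair of
    neighboring pixels that carry different labels.
    """
    h, w = labels.shape
    rows, cols = np.indices((h, w))
    data = unary[rows, cols, labels].sum()
    smooth = (labels[:, 1:] != labels[:, :-1]).sum() \
           + (labels[1:, :] != labels[:-1, :]).sum()
    return float(data + pairwise_weight * smooth)

def icm(unary, pairwise_weight, n_sweeps=10):
    """Iterated conditional modes: greedy, pixel-by-pixel energy descent.

    Illustration only. It reaches a local minimum and is much weaker than
    graph-cut or primal-dual solvers for the same kind of energy.
    """
    h, w, k = unary.shape
    labels = unary.argmin(axis=2)          # start from the best unary label
    all_labels = np.arange(k)
    for _ in range(n_sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                cost = unary[y, x].astype(float)   # astype returns a copy
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        # Potts penalty for disagreeing with this neighbor
                        cost += pairwise_weight * (all_labels != labels[ny, nx])
                best = int(cost.argmin())
                if best != labels[y, x]:
                    labels[y, x] = best
                    changed = True
        if not changed:
            break
    return labels

# Toy usage: a 3x3 grid with two labels. Every pixel prefers label 0
# except the center, which weakly prefers label 1.
unary = np.zeros((3, 3, 2))
unary[..., 1] = 1.0
unary[1, 1] = [0.6, 0.4]
initial = unary.argmin(axis=2)
result = icm(unary, pairwise_weight=0.5)
print(mrf_energy(initial, unary, 0.5), mrf_energy(result, unary, 0.5))
```

In the toy case the smoothness term outweighs the center pixel's weak preference, so the minimizer relabels it to agree with its neighbors and the total energy drops. A coherence term of this kind is what lets an energy formulation suppress isolated labeling errors.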


Author information

Corresponding author

Correspondence to David Tsai.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

(AVI 10.8 MB)

About this article

Cite this article

Tsai, D., Flagg, M., Nakazawa, A. et al. Motion Coherent Tracking Using Multi-label MRF Optimization. Int J Comput Vis 100, 190–202 (2012). https://doi.org/10.1007/s11263-011-0512-5

  • DOI: https://doi.org/10.1007/s11263-011-0512-5
