Motion Coherent Tracking Using Multi-label MRF Optimization

Tsai, David; Flagg, Matthew; Nakazawa, Atsushi; Rehg, James M.

doi:10.1007/s11263-011-0512-5

Published: 21 December 2011

Volume 100, pages 190–202 (2012)

International Journal of Computer Vision

David Tsai¹,
Matthew Flagg¹,
Atsushi Nakazawa² &
James M. Rehg¹

Abstract

We present a novel off-line algorithm for target segmentation and tracking in video. In our approach, video data is represented by a multi-label Markov Random Field model, and segmentation is accomplished by finding the minimum-energy label assignment. We propose a novel energy formulation that incorporates both segmentation and motion estimation in a single framework. Our energy functions enforce motion coherence both within and across frames. We utilize state-of-the-art methods to efficiently optimize over a large number of discrete labels. In addition, we introduce a new ground-truth dataset, called the Georgia Tech Segmentation and Tracking Dataset (GT-SegTrack), for the evaluation of segmentation accuracy in video tracking. We compare our method with several recent on-line tracking algorithms and provide quantitative and qualitative performance comparisons.
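
To make the idea of a joint segmentation-and-motion label space concrete, the following is a minimal sketch and not the authors' formulation. It assumes each site in a (frame, row, column) volume takes a label that pairs a foreground/background flag with a candidate displacement, that neighbouring sites within a frame and at the same location in adjacent frames pay a penalty when their labels disagree, and that the unary costs are supplied by the caller. The optimizer is iterated conditional modes (ICM), swapped in here purely because it is short; the abstract does not name the optimizer used in the paper. All names (`LABELS`, `pairwise_cost`, `energy`, `icm`) and the weights are hypothetical.

```python
import itertools
import numpy as np

# Hypothetical joint label space: (segment, displacement) pairs.
SEGMENTS = (0, 1)                          # 0 = background, 1 = target
DISPLACEMENTS = ((0, 0), (1, 0), (0, 1))   # assumed candidate (dx, dy) motions
LABELS = list(itertools.product(SEGMENTS, DISPLACEMENTS))


def pairwise_cost(a, b, lam_seg=1.0, lam_motion=0.5):
    """Penalty when neighbouring sites take labels a and b (assumed weights)."""
    (sa, da), (sb, db) = LABELS[a], LABELS[b]
    seg_term = lam_seg * (sa != sb)
    motion_term = lam_motion * (abs(da[0] - db[0]) + abs(da[1] - db[1]))
    return seg_term + motion_term


# Symmetric table of pairwise penalties, indexed by label pair.
PAIRWISE = np.array([[pairwise_cost(a, b) for b in range(len(LABELS))]
                     for a in range(len(LABELS))])

# Forward edges along (t, y, x): one temporal and two spatial directions.
OFFSETS = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


def energy(unary, labels):
    """Total energy: unary costs plus pairwise penalties, each edge once."""
    total = np.take_along_axis(unary, labels[..., None], axis=-1).sum()
    for off in OFFSETS:
        a = labels[tuple(slice(0, n - o) for n, o in zip(labels.shape, off))]
        b = labels[tuple(slice(o, None) for o in off)]
        total += PAIRWISE[a, b].sum()
    return float(total)


def icm(unary, max_sweeps=10):
    """Greedy coordinate descent: relabel one site at a time to its best label."""
    shape = unary.shape[:-1]
    labels = unary.argmin(axis=-1)         # start from the unary-only optimum
    for _ in range(max_sweeps):
        changed = False
        for site in np.ndindex(*shape):
            cost = unary[site].copy()
            for off in OFFSETS:
                for sign in (1, -1):
                    nb = tuple(s + sign * o for s, o in zip(site, off))
                    if all(0 <= c < n for c, n in zip(nb, shape)):
                        cost += PAIRWISE[:, labels[nb]]
            best = int(cost.argmin())
            if best != labels[site]:
                labels[site] = best
                changed = True
        if not changed:                    # local minimum reached
            break
    return labels
```

Calling `icm(unary)` on an array of shape `(T, H, W, len(LABELS))` returns an integer label volume, and `LABELS[k]` decodes label `k` back into its segment flag and displacement. ICM only finds a local minimum, which is one reason practical systems rely on stronger discrete optimizers when the label set is large.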

Author information

Authors and Affiliations

Center for Behavior Imaging and the Computational Perception Laboratory, School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA, 30332, USA
David Tsai, Matthew Flagg & James M. Rehg
Cybermedia Center, Osaka University, 1-32 Machikaneyama, Toyonaka, Osaka, 560-0043, Japan
Atsushi Nakazawa

Corresponding author

Correspondence to David Tsai.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

(AVI 10.8 MB)

About this article

Cite this article

Tsai, D., Flagg, M., Nakazawa, A. et al. Motion Coherent Tracking Using Multi-label MRF Optimization. Int J Comput Vis 100, 190–202 (2012). https://doi.org/10.1007/s11263-011-0512-5

Received: 26 December 2010
Accepted: 01 December 2011
Published: 21 December 2011
Issue Date: November 2012
DOI: https://doi.org/10.1007/s11263-011-0512-5
