Robust Visual Tracking via Structured Multi-Task Sparse Learning

Zhang, Tianzhu; Ghanem, Bernard; Liu, Si; Ahuja, Narendra

doi:10.1007/s11263-012-0582-z

Published: 09 November 2012

International Journal of Computer Vision, Volume 101, pages 367–383 (2013)

Abstract

In this paper, we formulate object tracking in a particle filter framework as a structured multi-task sparse learning problem, which we denote as Structured Multi-Task Tracking (S-MTT). Since we model particles as linear combinations of dictionary templates that are updated dynamically, learning the representation of each particle is considered a single task in Multi-Task Tracking (MTT). By employing popular sparsity-inducing $\ell_{p,q}$ mixed norms (specifically $p\in\{2,\infty\}$ and $q=1$), we regularize the representation problem to enforce joint sparsity and learn the particle representations together. Compared to previous methods that handle particles independently, our results demonstrate that mining the interdependencies between particles improves tracking performance and reduces overall computational complexity. Interestingly, we show that the popular $L_1$ tracker (Mei and Ling, IEEE Trans Pattern Anal Mach Intel 33(11):2259–2272, 2011) is a special case of our MTT formulation (denoted as the $L_{11}$ tracker) when $p=q=1$. Under the MTT framework, some of the tasks (particle representations) are often more closely related and more likely to share common relevant covariates than other tasks. Therefore, we extend the MTT framework to take into account pairwise structural correlations between particles (e.g. spatial smoothness of representation) and denote the novel framework as S-MTT. The problem of learning the regularized sparse representation in MTT and S-MTT can be solved efficiently using an Accelerated Proximal Gradient (APG) method that yields a sequence of closed-form updates. As such, S-MTT and MTT are computationally attractive. We test our proposed approach on challenging sequences involving heavy occlusion, drastic illumination changes, and large pose variations. Experimental results show that S-MTT performs much better than MTT, and that both methods consistently outperform state-of-the-art trackers.
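To make the role of the mixed norm concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common convention that the representation matrix `C` holds one particle per column and one dictionary template per row, so that the $\ell_{p,q}$ norm takes the $\ell_p$ norm of each row and then the $\ell_q$ norm of the resulting vector. The function names `mixed_norm` and `prox_l21` are hypothetical. The second function shows the standard closed-form proximal step for the $p=2$, $q=1$ case (row-wise soft-thresholding), which is the kind of update an APG iteration would apply; the exact updates used in the paper may differ.

```python
import numpy as np


def mixed_norm(C, p, q=1.0):
    """l_{p,q} mixed norm: l_p norm of every row, then l_q norm across rows.

    Assumed layout: rows index dictionary templates, columns index particles.
    """
    row_norms = np.linalg.norm(C, ord=p, axis=1)
    return float(np.linalg.norm(row_norms, ord=q))


def prox_l21(Y, lam):
    """Proximal step for lam * ||.||_{2,1}: shrink each row toward zero.

    A row whose l_2 norm is at most lam is set to zero for all particles at
    once, which is what produces joint (row-wise) sparsity.
    """
    row_norms = np.linalg.norm(Y, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all-zero.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(row_norms, 1e-12))
    return scale * Y


if __name__ == "__main__":
    C = np.array([[3.0, 4.0],   # template used by both particles
                  [0.0, 0.0],   # template used by neither particle
                  [1.0, 0.0]])  # template used weakly by one particle
    print(mixed_norm(C, 2))       # sum of row l_2 norms
    print(mixed_norm(C, np.inf))  # sum of row l_inf norms
    print(mixed_norm(C, 1))       # entrywise l_1 norm (the p = q = 1 case)
    print(prox_l21(C, 1.0))       # weak third row is zeroed out jointly
```

With $p=q=1$ the norm reduces to the sum of absolute values of all entries, so the penalty decouples across columns and each particle is regularized independently, which is consistent with the abstract's remark that the $L_1$ tracker is a special case.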

Notes

The score is the ratio of the intersection to the union of two bounding boxes. In our case, it would be the ratio of the intersection of the ground truth and the predicted tracks to their union in each frame.
Since the degree matrix $\hat{\mathbf{D}}$ is diagonal and non-negative and since the Laplacian $\mathbf{L}$ of any graph is positive semi-definite, the normalized Laplacian $\hat{\mathbf{L}}$ is positive semi-definite. Thus, $G(\mathbf{C})$ is convex in $\mathbf{C}$.
The proximal mapping of a non-smooth convex function $h(\cdot)$ is defined as: $\mathbf{prox}_h(\mathbf{x})=\arg\min_{\mathbf{u}}\left(h(\mathbf{u})+\frac{1}{2}\Vert \mathbf{u}-\mathbf{x}\Vert_2^2\right)$.
https://sites.google.com/site/videoadsc/.
http://www.cs.toronto.edu/~dross/ivt/.
http://vision.ucsd.edu/~bbabenko/project_miltrack.shtml.
http://cv.snu.ac.kr/research/~vtd/.
This dissimilarity measure is used often to compare tracking performance. Other measures can be used, including the PASCAL overlap score.
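The overlap score described in the notes above (intersection over union of the ground-truth and predicted boxes in a frame) can be sketched as follows. This is an illustrative helper only: the function name `overlap_score` and the `(x, y, width, height)` box layout are assumptions, not taken from the article.

```python
def overlap_score(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes.

    Each box is (x, y, width, height) with (x, y) the top-left corner;
    this layout is an assumption made for the example.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    ground_truth = (0.0, 0.0, 2.0, 2.0)
    prediction = (1.0, 1.0, 2.0, 2.0)
    print(overlap_score(ground_truth, prediction))
```

A per-sequence score would then be obtained by evaluating this on every frame, for example by averaging the per-frame values.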

References

Adam, A., Rivlin, E., & Shimshoni, I. (2006). Robust fragments-based tracking using the integral histogram. In IEEE conference on computer vision and pattern recognition (pp. 798–805).
Avidan, S. (2005). Ensemble tracking. In IEEE conference on computer vision and pattern recognition (pp. 494–501).
Babenko, B., Yang, M. H., & Belongie, S. (2009). Visual tracking with online multiple instance learning. In IEEE conference on computer vision and pattern recognition (pp. 983–990).
Bao, C., Wu, Y., Ling, H., & Ji, H. (2012). Real time robust l1 tracker using accelerated proximal gradient approach. In IEEE conference on computer vision and pattern recognition (pp. 1–8).
Beck, A., & Teboulle, M. (2009). A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), 183–202.
Black, M. J., & Jepson, A. D. (1998). Eigentracking: Robust matching and tracking of articulated objects using a view-based representation. International Journal of Computer Vision, 26(1), 63–84.
Blasch, E., & Kahler, B. (2005). Multiresolution EO/IR target tracking and identification. In International conference on information fusion (Vol. 8, pp. 1–8).
Candès, E. J., Romberg, J. K., & Tao, T. (2006). Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8), 1207–1223.
Chen, X., Pan, W., Kwok, J., & Carbonell, J. (2009). Accelerated gradient method for multi-task sparse learning problem. In IEEE international conference on data mining (pp. 746–751).
Collins, R. T., & Liu, Y. (2003). On-line selection of discriminative tracking features. In International conference on computer vision (pp. 346–352).
Comaniciu, D., Ramesh, V., & Meer, P. (2003). Kernel-based object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(5), 564–575.
Doucet, A., De Freitas, N., & Gordon, N. (2001). Sequential Monte Carlo methods in practice (1st ed.). Springer.
Grabner, H., Grabner, M., & Bischof, H. (2006). Real-time tracking via on-line boosting. In British machine vision conference (pp. 1–10).
Grabner, H., Leistner, C., & Bischof, H. (2008). Semi-supervised on-line boosting for robust tracking. In European conference on computer vision (pp. 234–247).
Isard, M., & Blake, A. (1998). Condensation—Conditional density propagation for visual tracking. International Journal of Computer Vision, 29(1), 5–28.
Jepson, A., Fleet, D., & El-Maraghi, T. (2003). Robust on-line appearance models for visual tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(10), 1296–1311.
Jiang, N., Liu, W., & Wu, Y. (2011). Adaptive and discriminative metric differential tracking. In IEEE conference on computer vision and pattern recognition (pp. 1161–1168).
Khan, Z., Balch, T., & Dellaert, F. (2004). A Rao-Blackwellized particle filter for eigentracking. In IEEE conference on computer vision and pattern recognition (pp. 980–986).
Kwon, J., & Lee, K. M. (2010). Visual tracking decomposition. In IEEE conference on computer vision and pattern recognition (pp. 1269–1276).
Leistner, C., Godec, M., Saffari, A., & Bischof, H. (2010). Online multi-view forests for tracking. In DAGM (pp. 493–502).
Li, H., Shen, C., & Shi, Q. (2011). Real-time visual tracking with compressed sensing. In IEEE conference on computer vision and pattern recognition (pp. 1305–1312).
Liu, B., Huang, J., Yang, L., & Kulikowski, C. (2011). Robust visual tracking with local sparse appearance model and k-selection. In IEEE conference on computer vision and pattern recognition (pp. 1–8).
Liu, B., Yang, L., Huang, J., Meer, P., Gong, L., & Kulikowski, C. (2010). Robust and fast collaborative tracking with two stage sparse optimization. In European conference on computer vision (pp. 1–14).
Liu, R., Cheng, J., & Lu, H. (2009). A robust boosting tracker with minimum error bound in a co-training framework. In International conference on computer vision (pp. 1459–1466).
Mei, X., & Ling, H. (2011). Robust visual tracking and vehicle classification via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(11), 2259–2272.
Mei, X., Ling, H., Wu, Y., Blasch, E., & Bai, L. (2011). Minimum error bounded efficient l1 tracker with occlusion detection. In IEEE conference on computer vision and pattern recognition (pp. 1257–1264).
Nesterov, Y. (2007). Gradient methods for minimizing composite objective function. In CORE discussion paper.
Peng, Y., Ganesh, A., Wright, J., Xu, W., & Ma, Y. (2012). RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34, 2233–2246.
Quattoni, A., Carreras, X., Collins, M., & Darrell, T. (2009). An efficient projection for $\ell_{1,\infty}$ regularization. In International conference on machine learning (pp. 857–864).
Ross, D., Lim, J., Lin, R. S., & Yang, M. H. (2008). Incremental learning for robust visual tracking. International Journal of Computer Vision, 77(1), 125–141.
Tseng, P. (2008). On accelerated proximal gradient methods for convex–concave optimization. Technical report. http://pages.cs.wisc.edu/~brecht/cs726docs/Tseng.APG.pdf.
Wu, Y., & Huang, T. S. (2004). Robust visual tracking by integrating multiple cues based on co-inference learning. International Journal of Computer Vision, 58, 55–71.
Yang, C., Duraiswami, R., & Davis, L. (2005). Fast multiple object tracking via a hierarchical particle filter. In International conference on computer vision (pp. 212–219).
Yang, M., Wu, Y., & Hua, G. (2009). Context-aware visual tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(7), 1195–1209.
Yilmaz, A., Javed, O., & Shah, M. (2006). Object tracking: A survey. ACM Computing Surveys, 38(4), 13.
Yin, Z., & Collins, R. (2008). Object tracking and detection after occlusion via numerical hybrid local and global mode-seeking. In IEEE conference on computer vision and pattern recognition (pp. 1–8).
Yu, Q., Dinh, T. B., & Medioni, G. (2008). Online tracking and reacquisition using co-trained generative and discriminative trackers. In European conference on computer vision (pp. 678–691).
Yuan, X., & Yan, S. (2010). Visual classification with multi-task joint sparse representation. In IEEE conference on computer vision and pattern recognition (pp. 3493–3500).
Zhang, T., Ghanem, B., Liu, S., & Ahuja, N. (2012a). Low-rank sparse learning for robust visual tracking. In European conference on computer vision (pp. 1–8).
Zhang, T., Ghanem, B., Liu, S., & Ahuja, N. (2012b). Robust visual tracking via multi-task sparse learning. In IEEE conference on computer vision and pattern recognition (pp. 1–8).
Zhou, S. K., Chellappa, R., & Moghaddam, B. (2004). Visual tracking and recognition using appearance-adaptive models in particle filters. IEEE Transactions on Image Processing, 13(11), 1491–1506.
Zhu, X. (2008). Semi-supervised learning literature survey. Computer sciences technical report 1530, University of Wisconsin-Madison.

Acknowledgments

This study is supported by the research grant for the Human Sixth Sense Programme at the Advanced Digital Sciences Center from Singapore’s Agency for Science, Technology and Research (A$^*$STAR).

Author information

Authors and Affiliations

Advanced Digital Sciences Center (ADSC), 1 Fusionopolis Way, #08-10 Connexis North Tower, Singapore, 138632, Singapore
Tianzhu Zhang
King Abdullah University of Science and Technology (KAUST), Al Khwarizmi Building #2224, Thuwal, Kingdom of Saudi Arabia
Bernard Ghanem
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, Singapore, 117576, Singapore
Si Liu
Department of Electrical and Computer Engineering, Beckman Institute, and Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, 2041 Beckman Institute, 405 N. Mathews Ave., Urbana, IL, 61801, USA
Narendra Ahuja

Corresponding author

Correspondence to Si Liu.

About this article

Cite this article

Zhang, T., Ghanem, B., Liu, S. et al. Robust Visual Tracking via Structured Multi-Task Sparse Learning. Int J Comput Vis 101, 367–383 (2013). https://doi.org/10.1007/s11263-012-0582-z

Received: 02 April 2012
Accepted: 30 September 2012
Published: 09 November 2012
Issue Date: January 2013
DOI: https://doi.org/10.1007/s11263-012-0582-z

Electronic supplementary material

Supplementary material 1 (avi 1635 KB)
Supplementary material 2 (avi 1575 KB)
Supplementary material 3 (avi 1890 KB)
Supplementary material 4 (avi 1777 KB)
Supplementary material 5 (avi 4539 KB)
Supplementary material 6 (avi 2923 KB)
Supplementary material 7 (avi 2204 KB)
Supplementary material 8 (avi 3211 KB)
Supplementary material 9 (avi 1682 KB)
Supplementary material 10 (avi 2641 KB)
Supplementary material 11 (avi 4459 KB)
Supplementary material 12 (avi 5427 KB)
Supplementary material 13 (avi 1603 KB)
Supplementary material 14 (avi 2836 KB)
Supplementary material 15 (avi 2762 KB)
