Abstract
Correlation-filter-based tracking models have received significant attention and achieved great success in terms of both tracking accuracy and computational efficiency. However, owing to the limitation of the loss function, the current correlation filtering paradigm cannot reliably respond to abrupt appearance changes of the target object. This study focuses on improving the robustness of correlation filter learning. An anisotropy of the filter response is observed and analyzed for the correlation-filter-based tracking model, through which the overfitting issue of previous methods is alleviated. Three sparsity-related loss functions are proposed to exploit this anisotropy, leading to three implementations of visual trackers and, correspondingly, improved overall tracking performance. Extensive experiments demonstrate that the proposed approach greatly improves the robustness of the learned correlation filter, and that the proposed trackers perform comparably against state-of-the-art methods on four recent standard tracking benchmark datasets.
Notes
The Gaussian shaped response is not necessarily isotropic, because the covariance matrix determines the shape of a Gaussian. It is isotropic only if the covariance matrix is diagonal with all diagonal elements equal (i.e., a scalar multiple of the identity). Previous methods employ only the isotropic Gaussian response, since it is considered the continuous counterpart of an impulse signal in the image space. For simplicity, the Gaussian shaped response hereafter refers to the isotropic case.
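As a concrete illustration of this note, the following sketch (Python/NumPy; not from the paper) builds the isotropic Gaussian shaped response with covariance \(\sigma^2 \mathbf{I}\) that is commonly used as the regression target in correlation filter training:

```python
import numpy as np

def gaussian_response(h, w, sigma=2.0):
    """Isotropic 2-D Gaussian shaped label, peaked at the map center.

    The covariance is sigma^2 * I (diagonal, equal entries), so the
    response depends only on the distance to the peak -- the isotropic
    case referred to in the note above.
    """
    yy, xx = np.mgrid[0:h, 0:w]
    dist2 = (yy - h // 2) ** 2 + (xx - w // 2) ** 2
    return np.exp(-dist2 / (2.0 * sigma ** 2))

g = gaussian_response(31, 31)
# Isotropy: offsets at equal distance from the peak get equal responses.
assert g[15, 18] == g[18, 15] == g[12, 15]
```

An anisotropic label would instead use a full covariance matrix, making the response depend on the direction of the offset as well as its distance.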
The exact equivalence between regression and correlation filtering under the circulant structure assumption is proved in Henriques et al. (2015).
The rows of the kernel matrix \({\mathbf {K}}\) are obtained from the full set of cyclic shifts of the vector \({\mathbf {k}}_1\).
References
Bach, F., Jenatton, R., Mairal, J., & Obozinski, G. (2011). Convex optimization with sparsity-inducing norms. Optimization for Machine Learning, 5, 19–53.
Beck, A., & Teboulle, M. (2009). A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), 183–202.
Bertinetto, L., Henriques, J., Valmadre, J., Torr, P., & Vedaldi, A. (2016a). SiameseFC-ResNet. In ECCV VOT workshop.
Bertinetto, L., Valmadre, J., Golodetz, S., Miksik, O., & Torr, P. (2016b). Staple: Complementary learners for real-time tracking. In CVPR.
Bertinetto, L., Valmadre, J., Miksik, O., Golodetz, S., & Torr, P. H. (2015). The importance of estimating object extent when tracking with correlation filters. In ICCV VOT workshop.
Bibi, A., Mueller, M., & Ghanem, B. (2016). Target response adaptation for correlation filter tracking. In ECCV.
Bogun, I., & Ribeiro, E. (2015). Structure tracker with the robust Kalman filter. In ICCV VOT workshop.
Bolme, D., Beveridge, J. R., Draper, B. A., & Lui, Y. M. (2010). Visual object tracking using adaptive correlation filters. In CVPR.
Chen, B., Wang, L., & Lu, H. (2017). FSTC. In ICCV VOT workshop.
Chen, K., & Tao, W. (2016). Convolutional regression for visual tracking. arXiv.
Chi, Z., Lu, H., Wang, L., & Sun, C. (2016). Dual deep network tracker. In ECCV VOT workshop.
Danelljan, M., Bhat, G., Khan, S., & Felsberg, M. (2017a). Efficient convolution operator tracker: Hand crafted. In ICCV VOT workshop.
Danelljan, M., Bhat, G., Khan, F., & Felsberg, M. (2017b). ECO: Efficient convolution operators for tracking. In CVPR.
Danelljan, M., Häger, G., Khan, F. S., & Felsberg, M. (2015). Learning spatially regularized correlation filters for visual tracking. In ICCV.
Danelljan, M., Häger, G., Khan, F.S., & Felsberg, M. (2014a). Accurate scale estimation for robust visual tracking. In BMVC.
Danelljan, M., Khan, F. S., Felsberg, M., & Weijer, J. V. D. (2014b). Adaptive color attributes for real-time visual tracking. In CVPR.
Danelljan, M., Robinson, A., Khan, F. S., & Felsberg, M. (2016). Beyond correlation filters: Learning continuous convolution operators for visual tracking. In ECCV.
Duffner, S., & Garcia, C. (2015). Using discriminative motion context for online visual object tracking. IEEE Transactions on Circuits and Systems for Video Technology (TCVST), 26(12), 2215–2225.
Gao, J., Zhang, T., Xu, C., & Liu, B. (2016). Discriminative deep correlation tracking. In ECCV VOT workshop.
Gundogdu, E., & Alatan, A. (2017). Good features to correlate for visual tracking. arXiv.
Hare, S., Saffari, A., & Torr, P. (2011). Struck: Structured output tracking with kernels. In ICCV.
He, Z., Fan, Y., & Zhuang, J. (2017). CFWCR. In ICCV VOT workshop.
Henriques, J., Caseiro, R., Martins, P., & Batista, J. (2012). Exploiting the circulant structure of tracking-by-detection with kernels. In ECCV.
Henriques, J., Caseiro, R., Martins, P., & Batista, J. (2015). High-speed tracking with kernelized correlation filters. IEEE TPAMI, 37(3), 583–596.
Hu, T., Du, D., Wen, L., Li, W., Qi, H., & Lyu, S. (2016). Geometric structure hyper-graph based tracker version 2. In ECCV VOT workshop.
Hua, Y., Alahari, K., & Schmid, C. (2015). Online object tracking with proposal selection. In ICCV.
Kalal, Z., Mikolajczyk, K., & Matas, J. (2012). Tracking-learning-detection. IEEE TPAMI, 34(7), 1409–1422.
Kristan, M., Matas, J., Leonardis, A., Vojir, T., Pflugfelder, R., Fernandez, G., et al. (2016). A novel performance evaluation methodology for single-target trackers. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 38(11), 2137–2155.
Kwon, J., & Lee, K. (2010). Visual tracking decomposition. In CVPR.
Lee, H., & Kim, D. (2016). Salient region based tracker. In ECCV VOT workshop.
Lee, J. Y., Choi, S., Jeong, J. C., Kim, J. W., & Cho, J. I. (2015). Scaled SumShift tracker. In ICCV VOT workshop.
Lee, J. Y., Choi, S., Jeong, J. C., Kim, J. W., & Cho, J. I. (2016). SumShift tracker with kernelized correlation filter. In ECCV VOT workshop.
Lee, J. Y., & Yu, W. (2011). Visual tracking by partition-based histogram backprojection and maximum support criteria. In IEEE international conference on robotics and biomimetics.
Li, Y., & Zhu, J. (2014). A scale adaptive kernel correlation filter tracker with feature integration. In ECCV workshop.
Li, Y., & Zhu, J. (2015). NSAMF. In ICCV VOT workshop.
Liu, S., Zhang, T., Cao, X., & Xu, C. (2016). Structural correlation filter for robust visual tracking. In CVPR.
Liu, T., Wang, G., & Yang, Q. (2015). Real-time part-based visual tracking via adaptive correlation filters. In CVPR.
Lukezic, A., Cehovin, L., & Kristan, M. (2015). Layered deformable parts tracker. In ICCV VOT workshop.
Lukezic, A., Cehovin, L., & Kristan, M. (2016). Deformable parts correlation filters for robust visual tracking. arXiv.
Lukezic, A., Vojir, T., Cehovin, L., Matas, J., & Kristan, M. (2017a). Discriminative correlation filter with channel and spatial reliability. In CVPR.
Lukezic, A., Vojir, T., Cehovin, L., Matas, J., & Kristan, M. (2017b). Discriminative correlation filter with channel and spatial reliability: Fast. In ICCV VOT workshop.
Ma, C., Huang, J. B., Yang, X., & Yang, M. H. (2015a). Hierarchical convolutional features for visual tracking. In ICCV.
Ma, C., Yang, X., Zhang, C., & Yang, M. H. (2015b). Long-term correlation tracking. In CVPR.
Mei, X., & Ling, H. (2011). Robust visual tracking and vehicle classification via sparse representation. IEEE TPAMI, 33(11), 2259–2272.
Mocanu, B., Tapu, R., & Zaharia, T. (2017). Adaptive single object tracking using offline learned motion and visual similar patterns. In ICCV VOT workshop.
Nam, H., Baek, M., & Han, B. (2016a). Modeling and propagating CNNs in a tree structure for visual tracking. arXiv.
Nam, H., & Han, B. (2016b). Learning multi-domain convolutional neural networks for visual tracking. In CVPR.
Poostchi, M., Palaniappan, K., Seetharaman, G., & Gao, K. (2017). Spatial pyramid context-aware tracker. In ICCV VOT workshop.
Possegger, H., Mauthner, T., & Bischof, H. (2015). In defense of color-based model-free tracking. In CVPR.
Qi, Y., Qin, L., Zhang, S., & Huang, Q. (2016a). Scale-and-state aware tracker. In ECCV VOT workshop.
Qi, Y., Zhang, S., Qin, L., Yao, H., Huang, Q., Lim, J., et al. (2016b). Hedged deep tracking. In CVPR.
Singh, S., & Mishra, D. (2017). gNetTracker. In ICCV VOT workshop.
Smeulders, A. W. M., Chu, D. M., Cucchiara, R., Calderara, S., Dehghan, A., & Shah, M. (2014). Visual tracking: An experimental survey. IEEE TPAMI, 36(7), 1442–1468.
Sui, Y., Tang, Y., & Zhang, L. (2015a). Discriminative low-rank tracking. In ICCV.
Sui, Y., Tang, Y., Zhang, L., & Wang, G. (2018a). Visual tracking via subspace learning: A discriminative approach. International Journal of Computer Vision (IJCV), 126(5), 515–536.
Sui, Y., Wang, G., Tang, Y., & Zhang, L. (2016a). Tracking completion. In ECCV.
Sui, Y., Wang, G., & Zhang, L. (2018b). Correlation filter learning toward peak strength for visual tracking. IEEE Transactions on Cybernetics, 48(4), 1290–1303.
Sui, Y., Wang, G., Zhang, L., & Yang, M. H. (2018c). Exploiting spatial–temporal locality of tracking via structured dictionary learning. IEEE Transactions on Image Processing (TIP), 27(3), 1282–1296.
Sui, Y., & Zhang, L. (2015). Visual tracking via locally structured Gaussian process regression. IEEE SPL, 22(9), 1331–1335.
Sui, Y., & Zhang, L. (2016). Robust tracking via locally structured representation. IJCV, 119(2), 110–144.
Sui, Y., Zhang, S., & Zhang, L. (2015b). Robust visual tracking via sparsity-induced subspace learning. IEEE TIP, 24(12), 4686–4700.
Sui, Y., Zhang, Z., Wang, G., Tang, Y., & Zhang, L. (2016b). Real-time visual tracking: Promoting the robustness of correlation filter learning. In ECCV.
Sui, Y., Zhao, X., Zhang, S., Yu, X., Zhao, S., & Zhang, L. (2015c). Self-expressive tracking. Pattern Recognit., 48(9), 2872–2884.
Sun, C., Liu, J., Lu, H., & Yang, M. H. (2017). Learning spatial-aware regressions for visual tracking. In ICCV VOT workshop.
Tang, M., & Feng, J. (2015). Multi-kernel correlation filter for visual tracking. In ICCV.
Vojir, T., Matas, J., & Noskova, J. (2015). Online adaptive hidden Markov model for multi-tracker fusion. arXiv.
Vojir, T., Noskova, J., & Matas, J. (2014). Robust scale-adaptive mean-shift for tracking. Pattern Recognit. Lett., 40, 250–258.
Walsh, R., & Medeiros, H. (2016). CF2 with response information failure detection. In ECCV VOT workshop.
Wang, D., Lu, H., & Yang, M. H. (2013). Least soft-threshold squares tracking. In CVPR.
Wang, L., Lu, H., Wang, Y., & Sun, C. (2016a). Multi-level deep feature tracker. In ECCV VOT workshop.
Wang, L., Ouyang, W., Wang, X., & Lu, H. (2015a). Visual tracking with fully convolutional networks. In ICCV.
Wang, L., Ouyang, W., Wang, X., & Lu, H. (2016b). STCT: Sequentially training convolutional networks for visual tracking. In CVPR.
Wang, N., Huang, Z., Li, S., & Yeung, D. Y. (2015b). Ensemble-based tracking: Aggregating crowdsourced structured time series data. In ICML.
Wang, N., Li, S., Gupta, A., & Yeung, D. Y. (2015c). Transferring rich feature hierarchies for robust visual tracking. arXiv.
Wang, N., Zhou, W., & Li, H. (2017a). Dual deep network tracker. In ICCV VOT workshop.
Wang, Q., Gao, J., Xing, J., Zhang, M., Z. Z., & Hu, W. (2017b). SiamDCF. In ICCV VOT workshop.
Wen, L., Du, D., Li, S., Chang, C.M., Lyu, S., & Huang, Q. (2016). Structure hyper-graph based correlation filter tracker. In ECCV VOT workshop.
Wright, J., Ma, Y., Mairal, J., & Sapiro, G. (2010). Sparse representation for computer vision and pattern recognition. Proceedings of The IEEE, 98(6), 1031–1044.
Wu, Y., Lim, J., & Yang, M. H. (2013). Online object tracking: A benchmark. In CVPR.
Wu, Y., Lim, J., & Yang, M. H. (2015). Object tracking benchmark. IEEE TPAMI, 37(9), 1834–1848.
Xu, Z., Li, Y., & Zhu, J. (2016). An improved STAPLE tracker with multiple feature integration. In ECCV VOT workshop.
Yang, L., Liu, R., Zhang, D., & Zhang, L. (2017). Deep location-specific tracking. In ICCV VOT workshop.
Yilmaz, A., Javed, O., & Shah, M. (2006). Object tracking: A Survey. ACM Computing Surveys, 38(4), 13–57.
Zhang, J., Ma, S., & Sclaroff, S. (2014a). MEEM: Robust tracking via multiple experts using entropy minimization. In ECCV.
Zhang, K., Zhang, L., Liu, Q., Zhang, D., & Yang, M. H. (2014b). Fast visual tracking via dense spatio-temporal context learning. In ECCV.
Zhang, M., Xing, J., Gao, J., & Hu, W. (2016). Fully-functional correlation filtering-based tracker. In ECCV VOT workshop.
Zhang, M., Xing, J., Gao, J., Shi, X., Wang, Q., & Hu, W. (2015a). Rotation adaptive joint scale-spatial correlation filter based tracker. In ICCV VOT workshop.
Zhang, S., Sui, Y., Zhao, S., Yu, X., & Zhang, L. (2015b). Multi-local-task learning with global regularization for object tracking. Pattern Recognit., 48(12), 3881–3894.
Zhang, S., Zhao, S., Sui, Y., & Zhang, L. (2015c). Single object tracking with fuzzy least squares support vector machine. IEEE TIP, 24(12), 5723–5738.
Zhang, T., Gao, J., & Xu, C. (2017a). Robust correlation particle filter. In ICCV VOT workshop.
Zhang, T., Ghanem, B., & Liu, S. (2012a). Robust visual tracking via multi-task sparse learning. In CVPR.
Zhang, T., Ghanem, B., Liu, S., & Ahuja, N. (2012b). Low-rank sparse learning for robust visual tracking. In ECCV.
Zhang, T., Liu, S., Xu, C., Yan, S., Ghanem, B., Ahuja, N., & Yang, M. H. (2015d). Structural sparse tracking. In CVPR.
Zhang, T., Xu, C., & Yang, M. H. (2017b). Multi-task correlation particle filter for robust object tracking. In CVPR.
Zhu, G., Porikli, F., & Li, H. (2016). Beyond local search: Tracking objects everywhere with instance-specific proposals. In CVPR.
Zhu, Z., Huang, G., Zou, W., Du, D., & Huang, C. (2017). UCT. In ICCV VOT workshop.
Communicated by Xiaoou Tang.
This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grants 61132007 and 61573351, in part by the Kansas NASA EPSCoR Program under Grant KNEP-PDG-10-2017-KU, and in part by the joint fund of Civil Aviation Research by the National Natural Science Foundation of China (NSFC) and Civil Aviation Administration under Grant U1533132.
Appendix: Baseline Trackers
Numerous visual trackers are employed as baselines in the experimental evaluations. This appendix lists the citations of these baseline trackers.
1.1 Baseline Trackers on the OTB 2015 Benchmark
PSCF (Sui et al. 2018b), RCF (Sui et al. 2016b), KCF_AT (Bibi et al. 2016), SRDCF (Danelljan et al. 2015), HCFT (Ma et al. 2015a), SAMF (Li and Zhu 2014), DSST (Danelljan et al. 2014a), KCF (Henriques et al. 2015), CN (Danelljan et al. 2014b), and CSK (Henriques et al. 2012).
1.2 Baseline Trackers on the VOT 2015 Benchmark
MDNet (Nam and Han 2016b), DeepSRDCF (Danelljan et al. 2015), EBT (Zhu et al. 2016), SRDCF (Danelljan et al. 2015), LDP (Lukezic et al. 2015), sPST (Hua et al. 2015), SC-EBT (Wang et al. 2015b), NSAMF (Li and Zhu 2015), Struck (Hare et al. 2011), RAJSSC (Zhang et al. 2015a), S3Tracker (Lee et al. 2015), SumShift (Lee and Yu 2011), SODLT (Wang et al. 2015c), DAT (Possegger et al. 2015), MEEM (Zhang et al. 2014a), RobStruck (Bogun and Ribeiro 2015), OACF (Bertinetto et al. 2015), MCT (Duffner and Garcia 2015), HMMTxD (Vojir et al. 2015), ASMS (Vojir et al. 2014).
1.3 Baseline Trackers on the VOT 2016 Benchmark
C-COT (Danelljan et al. 2016), TCNN (Nam and Han 2016a), SSAT (Qi et al. 2016a), MLDF (Wang et al. 2016a), Staple (Bertinetto et al. 2016b), DDC (Gao et al. 2016), EBT (Zhu et al. 2016), SRBT (Lee and Kim 2016), STAPLE+ (Xu et al. 2016), DNT (Chi et al. 2016), SSKCF (Lee et al. 2016), SiamFC-R (Bertinetto et al. 2016a), DeepSRDCF (Danelljan et al. 2015), SHCT (Wen et al. 2016), MDNet-N (Nam and Han 2016b), FCF (Zhang et al. 2016), SRDCF (Danelljan et al. 2015), RFD-CF2 (Walsh and Mederios 2016), GGTv2 (Hu et al. 2016), DPT (Lukezic et al. 2016).
1.4 Baseline Trackers on the VOT 2017 Benchmark
LSART (Sun et al. 2017), CFWCR (He et al. 2017), CFCF (Gundogdu and Alatan 2017), ECO (Danelljan et al. 2017b), Gnet (Singh and Mishra 2017), MCCT (Wang et al. 2017a), CCOT (Danelljan et al. 2016), CSRDCF (Lukezic et al. 2017a), SiamDCF (Wang et al. 2017b), MCPF (Zhang et al. 2017b), CRT (Chen and Tao 2016), ECOhc (Danelljan et al. 2017a), DLST (Yang et al. 2017), CSRDCFf (Lukezic et al. 2017b), RCPF (Zhang et al. 2017a), UCT (Zhu et al. 2017), SPCT (Poostchi et al. 2017), ATLAS (Mocanu et al. 2017), MEEM (Zhang et al. 2014a), FSTC (Chen et al. 2017).
Cite this article
Sui, Y., Zhang, Z., Wang, G. et al. Exploiting the Anisotropy of Correlation Filter Learning for Visual Tracking. Int J Comput Vis 127, 1084–1105 (2019). https://doi.org/10.1007/s11263-019-01156-6