ABSTRACT
In this paper, we propose a novel graph model, called the weighted sparse representation regularized graph, to learn a robust object representation from multispectral (RGB and thermal) data for visual tracking. The tracked object is represented as a graph whose nodes are image patches. This graph is learned dynamically in two respects. First, the graph affinity (i.e., the graph structure and edge weights), which indicates the appearance compatibility of two neighboring nodes, is optimized based on a weighted sparse representation, in which a modality weight is introduced to leverage RGB and thermal information adaptively. Second, each node weight, which indicates how likely the node is to belong to the foreground, is propagated from the other nodes according to the graph affinity. The optimized patch weights are then imposed on the extracted RGB and thermal features, and the target object is finally located with the structured SVM algorithm. We also contribute a comprehensive dataset for RGB-T tracking. Compared with existing datasets, the new dataset has the following advantages: 1) It is large enough for large-scale performance evaluation (total number of frames: 210K; maximum frames per video pair: 8K). 2) The alignment between RGB-T video pairs is highly accurate, so no pre- or post-processing is needed. 3) Occlusion levels are annotated, which allows the occlusion-sensitive performance of different methods to be analyzed. Extensive experiments on both public and newly created datasets demonstrate the effectiveness of the proposed tracker against several state-of-the-art tracking methods.
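To make the two steps that follow affinity learning more concrete, here is a minimal sketch of propagating patch weights along a given affinity and then imposing them on per-patch RGB and thermal features. It is not the paper's optimization: the affinity is taken as given rather than learned from the weighted sparse representation, the propagation is a generic manifold-ranking-style iteration swapped in for illustration, and the names `propagate_weights`, `weight_features`, `alpha`, and `seeds` are all hypothetical. The structured SVM localization step is omitted.

```python
def row_normalize(affinity):
    """Scale each row of a non-negative affinity matrix so it sums to 1."""
    normalized = []
    for row in affinity:
        total = sum(row)
        normalized.append([a / total if total > 0 else 0.0 for a in row])
    return normalized


def propagate_weights(affinity, seeds, alpha=0.5, iterations=100):
    """Iterate w <- alpha * S w + (1 - alpha) * seeds, with S row-normalized.

    `seeds` holds initial foreground scores in [0, 1] (an assumption of this
    sketch); `alpha` trades off neighbor influence against the initial scores.
    """
    s = row_normalize(affinity)
    w = list(seeds)
    for _ in range(iterations):
        w = [
            alpha * sum(s_ij * w_j for s_ij, w_j in zip(row, w)) + (1 - alpha) * y
            for row, y in zip(s, seeds)
        ]
    return w


def weight_features(patch_weights, rgb_feats, thermal_feats):
    """Scale each patch's RGB and thermal features by its weight, then concatenate."""
    descriptor = []
    for w, rgb, thermal in zip(patch_weights, rgb_feats, thermal_feats):
        descriptor.extend(w * v for v in rgb)
        descriptor.extend(w * v for v in thermal)
    return descriptor


# Toy example: patches 0 and 1 look alike, as do patches 2 and 3.
affinity = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.1, 0.0],
    [0.1, 0.1, 0.0, 0.8],
    [0.0, 0.0, 0.8, 0.0],
]
seeds = [1.0, 0.0, 0.0, 0.0]  # only patch 0 starts as confident foreground
weights = propagate_weights(affinity, seeds)

# Two patches with toy one- and two-dimensional features per modality.
descriptor = weight_features([0.5, 2.0], [[1.0, 2.0], [3.0]], [[4.0], [5.0, 6.0]])
```

In the toy example, patch 1 ends up with a higher weight than patches 2 and 3 because it is strongly connected to the seeded patch, which is the qualitative behavior the abstract attributes to propagating node weights along the graph affinity.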
Weighted Sparse Representation Regularized Graph Learning for RGB-T Object Tracking