
Applying Detection Proposals to Visual Tracking for Scale and Aspect Ratio Adaptability

International Journal of Computer Vision

Abstract

Recently proposed correlation filter based trackers achieve appealing performance despite their great simplicity and high speed. However, such trackers are not inherently adaptive to scale and aspect ratio changes, which results in suboptimal tracking accuracy. To tackle this problem, this paper integrates a class-agnostic detection proposal method, widely adopted in the object detection literature, into a correlation filter tracker. On the tracker side, optimizations such as feature integration, robust model updating and proposal rejection are applied for efficient integration. On the proposal generation side, by integrating and comparing four detection proposal generators along with two baseline methods, we find that the quality of detection proposals has a considerable influence on tracking accuracy. Therefore, EdgeBoxes, the most promising proposal generator, is chosen and further enhanced with background suppression. Evaluations are mainly performed on a challenging 50-sequence dataset (OTB50) and two of its subsets: 28 sequences with significant scale variation and 14 sequences with obvious aspect ratio change. Among trackers equipped with different proposal generators, state-of-the-art trackers and existing correlation filter variants, our proposed tracker achieves the highest accuracy while running efficiently at an average speed of 20.4 frames per second. Additionally, a per-sequence numerical performance analysis and experimental results on the VOT2014 dataset are presented to enable deeper insight into our approach.
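To make the overall pipeline easier to follow, the sketch below shows one per-frame step of a proposal-augmented correlation filter tracker in Python. It is a minimal illustration rather than the implementation evaluated in the paper: the filter is a plain single-channel MOSSE-style correlation filter (no feature integration), the proposal list is assumed to come from an external class-agnostic generator such as EdgeBoxes, and names such as ProposalCFTracker as well as the patch size, learning rate and rejection threshold are illustrative assumptions.

```python
# Minimal sketch of one tracking step that combines a correlation filter with
# class-agnostic detection proposals.  Helper names and parameter values are
# illustrative and do not correspond to the paper's implementation.
import numpy as np

FILTER_SIZE = (64, 64)  # internal patch size used by the filter (assumed value)

def gaussian_peak(shape, sigma=2.0):
    """Desired correlation output: a Gaussian centred on the target patch."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-(((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2)))

def resize_nn(patch, shape):
    """Nearest-neighbour resize so every candidate box matches the filter size."""
    ys = (np.arange(shape[0]) * patch.shape[0] / shape[0]).astype(int)
    xs = (np.arange(shape[1]) * patch.shape[1] / shape[1]).astype(int)
    return patch[np.ix_(ys, xs)]

def crop(frame, box):
    """Extract an (x, y, w, h) region from a grayscale frame (boundary handling omitted)."""
    x, y, w, h = (int(round(v)) for v in box)
    return resize_nn(frame[y:y + h, x:x + w].astype(float), FILTER_SIZE)

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes, used for proposal rejection."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[0] + a[2], b[0] + b[2]), min(a[1] + a[3], b[1] + b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

class ProposalCFTracker:
    """MOSSE-style correlation filter plus proposal-based scale/aspect-ratio refinement."""

    def __init__(self, learning_rate=0.02, reject_iou=0.5):
        self.lr = learning_rate          # model update rate (assumed value)
        self.reject_iou = reject_iou     # proposals overlapping less than this are rejected
        self.A = self.B = None           # filter numerator / denominator (frequency domain)

    def _train_terms(self, patch):
        G = np.fft.fft2(gaussian_peak(patch.shape))
        F = np.fft.fft2(patch)
        return G * np.conj(F), F * np.conj(F) + 1e-2

    def init(self, frame, box):
        self.A, self.B = self._train_terms(crop(frame, box))

    def respond(self, patch):
        """Correlation response map of the learned filter on a candidate patch."""
        return np.real(np.fft.ifft2((self.A / self.B) * np.fft.fft2(patch)))

    def track(self, frame, box, proposals):
        """One frame: translate with the filter, refine scale/aspect ratio with
        proposals that survive rejection, then update the model."""
        x, y, w, h = box
        resp = self.respond(crop(frame, box))
        dy, dx = np.unravel_index(resp.argmax(), resp.shape)
        x += (dx - FILTER_SIZE[1] // 2) * w / FILTER_SIZE[1]
        y += (dy - FILTER_SIZE[0] // 2) * h / FILTER_SIZE[0]
        best, best_score = (x, y, w, h), resp.max()

        for p in proposals:                       # e.g. EdgeBoxes output for this frame
            if iou(p, best) < self.reject_iou:    # proposal rejection
                continue
            score = self.respond(crop(frame, p)).max()
            if score > best_score:                # accept a better scale / aspect ratio
                best, best_score = p, score

        newA, newB = self._train_terms(crop(frame, best))
        self.A = (1 - self.lr) * self.A + self.lr * newA   # linear-interpolation update
        self.B = (1 - self.lr) * self.B + self.lr * newB
        return best
```

In the paper's tracker, the single-channel patch is replaced by integrated features, the model update is made robust, and the proposals are produced by EdgeBoxes enhanced with background suppression, as summarised in the abstract above.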


Notes

  1. We notice that in some sequences, the tracking bounding box of STC shrinks to an extremely small size, resulting in even faster speed but unreliable results.

  2. Here “\(x\sim y\)” means that the variation is examined between each frame’s x-th and y-th previous frame.

  3. Here “\(x\sim y\)” means that the relative variation exceeds \((1/x,\, x)\) but still remains within \((1/y,\, y)\); one possible reading of this convention is illustrated in the snippet below.
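The following hypothetical helper illustrates one possible reading of the binning notation in notes 2 and 3; the interpretation of the garbled interval notation is an assumption, not taken from the paper.

```python
# Hypothetical illustration of the "x~y" binning convention in note 3.
# The interpretation of the interval notation is an assumption.
def in_variation_bin(ratio, x, y):
    """Return True if a relative variation `ratio` (e.g. the scale ratio between a
    frame and one of its previous frames) exceeds the interval (1/x, x) but still
    remains within (1/y, y).  Growth and shrinkage are treated symmetrically."""
    m = max(ratio, 1.0 / ratio)
    return x < m < y
```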

References

  • Alexe, B., Deselaers, T., & Ferrari, V. (2012). Measuring the objectness of image windows. TPAMI, 34(11), 2189–2202.

  • Arbelaez, P., Pont-Tuset, J., Barron, J., Marqués, F., & Malik, J. (2014). Multiscale combinatorial grouping. In CVPR (pp. 328–335).

  • Belagiannis, V., Schubert, F., Navab, N., & Ilic, S. (2012). Segmentation based particle filtering for real-time 2D object tracking. In ECCV (pp. 842–855).

  • Bolme, D. S., Beveridge, J. R., Draper, B. A., & Lui, Y. M. (2010). Visual object tracking using adaptive correlation filters. In CVPR (pp. 2544–2550).

  • Cai, Z., Wen, L., Yang, J., Lei, Z., & Li, S. (2012). Structured visual tracking with dynamic graph. In ACCV (pp. 86–97).

  • Carreira, J., & Sminchisescu, C. (2012). CPMC: Automatic object segmentation using constrained parametric min-cuts. TPAMI, 34(7), 1312–1328.

  • Cheng, M. M., Zhang, Z., Lin, W. Y., & Torr, P. H. S. (2014). BING: Binarized normed gradients for objectness estimation at 300fps. In CVPR (pp. 3286–3293).

  • Comaniciu, D., Ramesh, V., & Meer, P. (2003). Kernel-based object tracking. TPAMI, 25(5), 564–577.

  • Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In CVPR (pp. 886–893).

  • Danelljan, M., Häger, G., Khan, F. S., & Felsberg, M. (2014a). Accurate scale estimation for robust visual tracking. In BMVC.

  • Danelljan, M., Shahbaz Khan, F., Felsberg, M., & Van de Weijer, J. (2014b). Adaptive color attributes for real-time visual tracking. In CVPR (pp. 1090–1097).

  • Dollár, P., & Zitnick, C. L. (2013). Structured forests for fast edge detection. In ICCV (pp. 1841–1848).

  • Duffner, S., & Garcia, C. (2013). PixelTrack: A fast adaptive algorithm for tracking non-rigid objects. In ICCV (pp. 2480–2487).

  • Everingham, M., Eslami, S. M. A., Van Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2015). The pascal visual object classes challenge: A retrospective. IJCV, 111(1), 98–136.

  • Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR (pp. 580–587).

  • Godec, M., Roth, P. M., & Bischof, H. (2011). Hough-based tracking of non-rigid objects. In ICCV (pp. 81–88).

  • Hare, S., Saffari, A., & Torr, P. H. S. (2011). Struck: Structured output tracking with kernels. In ICCV (pp. 263–270).

  • He, K., Zhang, X., Ren, S., & Sun, J. (2014). Spatial pyramid pooling in deep convolutional networks for visual recognition. In ECCV (pp. 346–361).

  • Henriques, J. F., Caseiro, R., Martins, P., & Batista, J. (2012). Exploiting the circulant structure of tracking-by-detection with kernels. In ECCV (pp. 702–715).

  • Henriques, J. F., Caseiro, R., Martins, P., & Batista, J. (2015). High-speed tracking with kernelized correlation filters. TPAMI. doi:10.1109/TPAMI.2014.2345390.

  • Hosang, J., Benenson, R., & Schiele, B. (2014). How good are detection proposals, really? In BMVC.

  • Hosang, J., Benenson, R., Dollár, P., & Schiele, B. (2015). What makes for effective detection proposals? TPAMI. doi:10.1109/TPAMI.2015.2465908.

  • Hua, Y., Alahari, K., & Schmid, C. (2015). Online object tracking with proposal selection. In ICCV (pp. 3092–3100).

  • Huang, D., Luo, L., Wen, M., Chen, Z., & Zhang, C. (2015). Enable scale and aspect ratio adaptability in visual tracking with detection proposals. In BMVC.

  • Jia, X., Lu, H., & Yang, M. H. (2012). Visual tracking via adaptive structural local sparse appearance model. In CVPR (pp. 1822–1829).

  • Kalal, Z., Matas, J., & Mikolajczyk, K. (2010). P-N learning: Bootstrapping binary classifiers by structural constraints. In CVPR (pp. 49–56).

  • Krähenbühl, P., & Koltun, V. (2014). Geodesic object proposals. In ECCV (pp. 725–739).

  • Kristan, M., Pflugfelder, R., Leonardis, A., et al. (2013). The visual object tracking VOT2013 challenge results. In ICCV workshop (pp. 98–111).

  • Kristan, M., Pflugfelder, R., Leonardis, A., et al. (2014). The visual object tracking VOT2014 challenge results. http://votchallenge.net/vot2014/download/vot_2014_paper.pdf

  • Kwon, J., & Lee, K. M. (2010). Visual tracking decomposition. In CVPR (pp. 1269–1276).

  • Li, Y., & Zhu, J. (2014). A scale adaptive kernel correlation filter tracker with feature integration. In ECCV workshop (pp. 254–265).

  • Liang, P., Pang, Y., Liao, C., Mei, X., & Ling, H. (2016). Adaptive objectness for object tracking. IEEE Signal Processing Letters, 23(7), 949–953.

  • Liu, B., Huang, J., Yang, L., & Kulikowsk, C. (2011). Robust tracking using local sparse appearance model and k-selection. In CVPR (pp. 1313–1320).

  • Liu, T., Wang, G., & Yang, Q. (2015). Real-time part-based visual tracking via adaptive correlation filters. In CVPR (pp. 4902–4912).

  • Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS (pp. 91–99).

  • Uijlings, J. R. R., Van de Sande, K. E. A., Gevers, T., & Smeulders, A. W. M. (2013). Selective search for object recognition. IJCV, 104(2), 154–171.

  • Van de Weijer, J., Schmid, C., Verbeek, J., & Larlus, D. (2009). Learning color names for real-world applications. TIP, 18(7), 1512–1523.

  • Wang, A., Wan, G., Cheng, Z., & Li, S. (2009). An incremental extremely random forest classifier for online learning and tracking. In ICIP (pp. 1449–1452).

  • Wang, A., Cheng, Z., Martin, R. R., & Li, S. (2013). Multiple-cue-based visual object contour tracking with incremental learning. LNCS, 7544, 225–243.

  • Wen, L., Du, D., Lei, Z., Li, S. Z., & Yang, M. H. (2015). JOTS: Joint online tracking and segmentation. In CVPR (pp. 2226–2234).

  • Wu, Y., Lim, J., & Yang, M. H. (2013). Online object tracking: A benchmark. In CVPR (pp. 2411–2418).

  • Zhang, K., Zhang, L., Zhang, D., & Yang, M. H. (2014). Fast visual tracking via dense spatio-temporal context learning. In ECCV (pp. 127–141).

  • Zhong, W., Lu, H., & Yang, M. H. (2012). Robust object tracking via sparsity-based collaborative model. In CVPR (pp. 1838–1845).

  • Zhou, T. (2015). BING objectness proposal estimator MATLAB (mex-c) wrapper. https://github.com/tfzhou/BINGObjectness

  • Zhu, G., Porikli, F., & Li, H. (2016a). Beyond local search: Tracking objects everywhere with instance-specific proposals. In CVPR (pp. 943–951).

  • Zhu, G., Porikli, F., & Li, H. (2016b). Robust visual tracking with deep convolutional neural network based object proposals on PETS. In CVPR workshop (pp. 26–33).

  • Zhu, G., Wang, J., Wu, Y., Zhang, X., & Lu, H. (2016c). MC-HOG correlation tracking with saliency proposal. In AAAI (pp. 3690–3696).

  • Zitnick, C. L., & Dollár, P. (2014). Edge Boxes: Locating object proposals from edges. In ECCV (pp. 391–405).


Acknowledgements

The authors gratefully acknowledge the support from the National Natural Science Foundation of China under Nos. 61272145 and 61402504, and from the 863 Program of China under No. 2012-AA012706.

Author information

Corresponding author

Correspondence to Lei Luo.

Additional information

Communicated by Cordelia Schmid.


About this article


Cite this article

Huang, D., Luo, L., Chen, Z. et al. Applying Detection Proposals to Visual Tracking for Scale and Aspect Ratio Adaptability. Int J Comput Vis 122, 524–541 (2017). https://doi.org/10.1007/s11263-016-0974-6

