
Siamese Tracking with Bilinear Features

  • Conference paper
Pattern Recognition (ACPR 2021)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13189))


Abstract

Bilinear features originate in fine-grained visual recognition, where they encode the detailed representations and attributes needed to differentiate visually similar objects. Apparent similarity is also a challenge in visual tracking: background distractors interfere with a Siamese tracker's ability to localize the target object, especially when the distractors and the target belong to the same object category. To increase the discrimination between objects of similar appearance, we propose an efficient bilinear encoding method for Siamese tracking. The proposed method consists of a self-bilinear encoder and a cross-bilinear encoder. The bilinear features generated by the self-bilinear encoder represent the variations of the target itself, while those generated by the cross-bilinear encoder represent the difference between the target and distractors. With these encoders, Siamese trackers capture target appearance variations while differentiating the target from background distractors. Experiments on benchmark datasets show the effectiveness of bilinear features. Our tracker performs favorably against state-of-the-art approaches.
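As a rough illustration of what a bilinear feature is, the sketch below implements generic bilinear pooling: the outer product of two feature vectors at each spatial position, averaged over positions, followed by the signed square root and L2 normalization commonly used with such features. It is not the encoder proposed in this paper, whose details are not given in the abstract. The function names `bilinear_pool` and `normalize`, the toy feature values, and the pairing of "target" with "distractor" features in the cross case are all assumptions made for illustration.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Average the outer products a_i b_i^T over N spatial positions.

    feats_a: list of N feature vectors, each of length Ca
    feats_b: list of N feature vectors, each of length Cb
    Returns a Ca x Cb matrix as a list of lists.
    """
    if len(feats_a) != len(feats_b) or not feats_a:
        raise ValueError("inputs must share the same non-zero number of positions")
    n = len(feats_a)
    ca, cb = len(feats_a[0]), len(feats_b[0])
    pooled = [[0.0] * cb for _ in range(ca)]
    for a, b in zip(feats_a, feats_b):
        for i in range(ca):
            for j in range(cb):
                pooled[i][j] += a[i] * b[j] / n
    return pooled


def normalize(matrix):
    """Flatten, apply a signed square root, then L2-normalize."""
    flat = [math.copysign(math.sqrt(abs(v)), v) for row in matrix for v in row]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


# Hypothetical toy features: 2 spatial positions, 2 channels each.
target = [[1.0, 0.0], [0.0, 2.0]]
distractor = [[1.0, 1.0], [2.0, 0.0]]

# "Self" pairing: a feature map with itself (second-order statistics).
self_feat = bilinear_pool(target, target)
# "Cross" pairing: two different feature maps (assumed pairing, for illustration).
cross_feat = bilinear_pool(target, distractor)

self_vec = normalize(self_feat)
```

Pairing a feature map with itself yields a symmetric matrix of channel co-activations, while pairing two different maps yields a generally asymmetric matrix of interactions between them. That distinction is the only sense in which this sketch mirrors the self/cross terminology of the abstract.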



Author information

Corresponding author

Correspondence to Zhixiong Pi.


Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Pi, Z., Gao, C., Sang, N. (2022). Siamese Tracking with Bilinear Features. In: Wallraven, C., Liu, Q., Nagahara, H. (eds) Pattern Recognition. ACPR 2021. Lecture Notes in Computer Science, vol 13189. Springer, Cham. https://doi.org/10.1007/978-3-031-02444-3_32

  • DOI: https://doi.org/10.1007/978-3-031-02444-3_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-02443-6

  • Online ISBN: 978-3-031-02444-3

  • eBook Packages: Computer Science, Computer Science (R0)
