
A method of real-time object tracking combining temporal and spatial information

Soft Computing

Abstract

The purpose of single-target tracking is to locate a specific moving object accurately and continuously. However, when the target undergoes fast motion, severe occlusion, or very small scale, or when distractors share its local features, tracking algorithms based on correlation filters or convolutional neural networks produce localization errors. To address these problems, this paper designs a single-target tracking algorithm, the relative temporal-spatial network (RTSnet). RTSnet is a multi-threaded network composed of a Relative Temporal Information Network (RTInet) and a Relative Spatial Information Network (RSInet). RTInet is designed on the basis of the LSTM and exploits its temporal prediction capability to extract the relative temporal information between consecutive frames of the target. RSInet, a Siamese network improved from the Triplet Network, performs similarity estimation to obtain the spatial information between consecutive frames of the target. In the experiments, RTSnet is trained on the LaSOT training set and evaluated on the LaSOT test set and the OTB100 dataset. On the LaSOT test set, RTSnet reaches an accuracy of 85.5%, compared with 62.3% for Trans-T and 57.4% for STMTrack. Meanwhile, its tracking speed reaches 117.3 fps because RTSnet adopts dual-thread operation. On the OTB100 dataset, the accuracy of RTSnet is 81.1%.
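The abstract describes the architecture only at a high level, so the following PyTorch sketch is a hypothetical illustration of the two-branch design it outlines, not the authors' implementation: an LSTM branch (standing in for RTInet) predicts the target's next state from its states in preceding frames, a triplet-style Siamese branch (standing in for RSInet) scores the spatial similarity between the template and a candidate crop, and the two branches run on separate threads, mirroring the dual-thread operation credited for the 117.3 fps speed. All class names, layer sizes, and the box-as-state representation are assumptions.

import threading

import torch
import torch.nn as nn
import torch.nn.functional as F


class RTInet(nn.Module):
    # Temporal branch (hypothetical): an LSTM that predicts the next target
    # state (here, a 4-dim bounding box) from the states in preceding frames.
    def __init__(self, state_dim=4, hidden_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, state_dim)

    def forward(self, past_states):          # past_states: (B, T, 4)
        out, _ = self.lstm(past_states)
        return self.head(out[:, -1])         # predicted box for the next frame


class RSInet(nn.Module):
    # Spatial branch (hypothetical): a Siamese embedding network, assumed to
    # be trained with a triplet loss (e.g. nn.TripletMarginLoss) so that
    # crops of the same target map to nearby embeddings.
    def __init__(self, embed_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )

    def embed(self, crop):                   # crop: (B, 3, H, W)
        return F.normalize(self.backbone(crop), dim=1)

    def similarity(self, template, candidate):
        # Cosine similarity between L2-normalized embeddings.
        return (self.embed(template) * self.embed(candidate)).sum(dim=1)


def track_frame(rtinet, rsinet, past_states, template, candidate):
    # Run the two branches on separate threads, one plausible reading of the
    # "dual-thread operation" in the abstract.
    results = {}
    t1 = threading.Thread(target=lambda: results.update(box=rtinet(past_states)))
    t2 = threading.Thread(target=lambda: results.update(score=rsinet.similarity(template, candidate)))
    t1.start(); t2.start(); t1.join(); t2.join()
    return results["box"], results["score"]


if __name__ == "__main__":
    # Usage with random data: one sequence of 8 past boxes, 127x127 crops.
    rtinet, rsinet = RTInet(), RSInet()
    past = torch.randn(1, 8, 4)
    tmpl, cand = torch.randn(1, 3, 127, 127), torch.randn(1, 3, 127, 127)
    box, score = track_frame(rtinet, rsinet, past, tmpl, cand)
    print(box.shape, score.shape)            # torch.Size([1, 4]) torch.Size([1])

Under this sketch, the temporal prediction can narrow the search region while the similarity score verifies the candidate, which is one plausible way the temporal and spatial cues could be combined.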




This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

  • Bertinetto L, Valmadre J, Henriques JF et al (2016a) Fully-convolutional siamese networks for object tracking. In: European conference on computer vision, pp 850–865. https://doi.org/10.1007/978-3-319-48881-3_56

  • Bertinetto L, Valmadre J, Golodetz S et al (2016b) Staple: complementary learners for real-time tracking. In: Computer vision and pattern recognition, pp 1401–1409

  • Bolme DS, Beveridge JR, Draper BA et al (2010) Visual object tracking using adaptive correlation filters. In: Computer vision and pattern recognition, pp 2544–2550

  • Chen X, Yan B, Zhu J, Wang D, Lu H (2021) Transformer tracking. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR)

  • Cheng S, Zhong B, Li G, Liu X, Tang Z, Li X et al (2021) Learning to filter: siamese relation network for robust tracking

  • Danelljan M, Hager G, Khan FS et al (2014) Accurate scale estimation for robust visual tracking. In: British machine vision conference

  • Danelljan M, Hager G, Khan FS et al (2015) Learning spatially regularized correlation filters for visual tracking. In: International conference on computer vision, pp 4310–4318

  • Danelljan M, Bhat G, Khan FS et al (2017) ECO: efficient convolution operators for tracking. In: Computer vision and pattern recognition, pp 6931–6939

  • Fan H, Lin L, Yang F et al (2018) LaSOT: a high-quality benchmark for large-scale single object tracking. arXiv preprint. https://doi.org/10.1007/s11263-020-01387-y

  • Fu Z, Liu Q, Fu Z, Wang Y (2021) STMTrack: template-free visual tracking with space-time memory networks

  • Galoogahi HK, Fagg A, Lucey S et al (2017) Learning background-aware correlation filters for visual tracking. In: International conference on computer vision, pp 1144–1152

  • Greve R, Jacobsen EJ, Risi S et al (2016) Evolving neural Turing machines for reward-based learning. In: Genetic and evolutionary computation conference, pp 117–124

  • Gulcehre C, Chandar S, Cho K et al (2016) Dynamic neural Turing machine with soft and hard addressing schemes

  • Guo Q, Feng W, Zhou C et al (2017) Learning dynamic siamese network for visual object tracking. In: International conference on computer vision, pp 1781–1789

  • Hare S, Golodetz S, Saffari A et al (2015) Struck: structured output tracking with kernels. IEEE Trans Patt Anal Mach Intell 38(10):2096–2109

  • Held D, Thrun S, Savarese S et al (2016) Learning to track at 100 FPS with deep regression networks. In: European conference on computer vision, pp 749–765. https://doi.org/10.1007/978-3-319-46448-0_45

  • Henriques JF, Caseiro R, Martins P et al (2015) High-speed tracking with kernelized correlation filters. IEEE Trans Patt Anal Mach Intell 37(3):583–596

  • Henriques JF, Caseiro R, Martins P et al (2012) Exploiting the circulant structure of tracking-by-detection with kernels. In: European conference on computer vision, pp 702–715

  • Hoffer E, Ailon N (2015) Deep metric learning using triplet network. https://doi.org/10.1007/978-3-319-24261-3_7

  • Iandola FN, Han S, Moskewicz MW et al (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size

  • Li B, Yan J, Wu W et al (2018) High performance visual tracking with siamese region proposal network. In: Computer vision and pattern recognition, pp 8971–8980

  • Li Y, Zhu J (2014) A scale adaptive kernel correlation filter tracker with feature integration. In: European conference on computer vision. https://doi.org/10.1007/978-3-319-16181-5_18

  • Lukezic A, Vojir T, Zajc LC et al (2017) Discriminative correlation filter with channel and spatial reliability. In: Computer vision and pattern recognition, pp 4847–4856. https://doi.org/10.1007/s11263-017-1061-3

  • Ma N, Zhang X, Zheng HT et al (2018) ShuffleNet V2: practical guidelines for efficient CNN architecture design. https://doi.org/10.1007/978-3-030-01264-9_8

  • Nam H, Han B (2016) Learning multi-domain convolutional neural networks for visual tracking. In: Computer vision and pattern recognition, pp 4293–4302

  • Possegger H, Mauthner T, Bischof H et al (2015) In defense of color-based model-free tracking. In: Computer vision and pattern recognition, pp 2113–2120

  • Ren S, He K, Girshick RB et al (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Patt Anal Mach Intell 39(6):1137–1149

  • Sandler M, Howard A, Zhu M et al (2018) Inverted residuals and linear bottlenecks: mobile networks for classification, detection and segmentation

  • Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition

  • Song Y, Ma C, Wu X et al (2018) VITAL: visual tracking via adversarial learning. In: Computer vision and pattern recognition, pp 8990–8999

  • Szegedy C, Liu W, Jia Y et al (2015) Going deeper with convolutions. In: Computer vision and pattern recognition, pp 1–9

  • Valmadre J, Bertinetto L, Henriques JF et al (2017) End-to-end representation learning for correlation filter based tracking. In: Computer vision and pattern recognition, pp 5000–5008

  • Wang M, Liu Y, Huang Z et al (2017) Large margin object tracking with circulant feature maps. In: Computer vision and pattern recognition, pp 4800–4808

  • Wu Y, Lim J, Yang MH (2015) Object tracking benchmark. IEEE Trans Patt Anal Mach Intell 37(9):1834–1848

  • Yang DD, Cai YZ, Mao N et al (2016) Long-term object tracking based on kernelized correlation filters. Optics Prec Eng 24(8):2037–2049

  • Zhang Y, Wang L, Qi J et al (2018) Structured siamese network for real-time visual tracking. In: European conference on computer vision, pp 355–370

  • Zhou J, Xu W (2015) End-to-end learning of semantic role labeling using recurrent neural networks. In: International joint conference on natural language processing, pp 1127–1137


Acknowledgements

This work was supported by the Natural Science Foundation of Guangdong Province under Grant No. 2020A1515010784, the National Natural Science Foundation of China under Grant No. 61976063, and the Natural Science Program of Guangdong University of Science and Technology under Grant No. GKY-2021KYQNK-2.

Funding

The work was supported by the Natural Science Foundation of Guangdong Province (No. 2020A1515010784).

Author information


Contributions

XJ conceived the algorithms, conducted the experimental demonstrations, and wrote the paper; ZL, KL, and SZ wrote the paper.

Corresponding author

Correspondence to Xiaoshuo Jia.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Jia, X., Li, Z., Li, K. et al. A method of real-time object tracking combining temporal and spatial information. Soft Comput 26, 8689–8698 (2022). https://doi.org/10.1007/s00500-022-07154-0
