SSL-MOT: self-supervised learning based multi-object tracking

Kim, Sangwon; Lee, Jimi; Ko, Byoung Chul

doi:10.1007/s10489-022-03473-9

Published: 22 April 2022

Applied Intelligence, volume 53, pages 930–940 (2023)

Abstract

Although Siamese networks are the most popular approach to object tracking, they can collapse to an undesirable trivial solution and require a large amount of training data that reflects frame-by-frame changes in the object’s shape. To address these problems, this paper proposes a self-supervised learning method for multi-object tracking (SSL-MOT) based on a contrastive structure. Unlike existing SSL methods, we adopt a generative adversarial network as a preprocessing step to generate various pose changes of the tracked objects. A positive pair composed of the augmented image and pose data is fed to the SSL network to learn an encoder that produces a non-collapsed output vector. To improve the discriminative power of the encoder’s output features, we propose an affinity correlation distance, which combines invariance and redundancy terms, as the training loss. At test time, data association uses only the dot product between the output vectors of the tracker and the detection, which significantly reduces computation time and makes real-time online tracking at about 12 fps possible. The proposed method is the first attempt to apply SSL to online MOT. Experimental results on the MOT16, MOT17, and MOT20 challenge datasets show that the proposed method is a fast and reasonable tracking method that occupies less memory and achieves excellent tracking performance compared with other state-of-the-art methods.
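
The dot-product data association mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`l2_normalize`, `associate`), the L2 normalization, the greedy one-to-one matching, and the `min_sim` threshold are all assumptions made for illustration, since the abstract states only that a dot product between tracker and detection output vectors is used.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def associate(track_feats, det_feats, min_sim=0.5):
    """Match tracks to detections using only dot products of embeddings.

    Hypothetical helper: the greedy matching and the `min_sim` threshold
    are illustrative assumptions, not details taken from the paper.
    Returns (matches, sim), where matches is a sorted list of
    (track_index, detection_index) pairs and sim is the similarity matrix.
    """
    t = np.asarray(track_feats, dtype=float)
    d = np.asarray(det_feats, dtype=float)
    if t.size == 0 or d.size == 0:
        return [], np.zeros((len(t), len(d)))

    # One matrix product gives every track-detection similarity at once.
    sim = l2_normalize(t) @ l2_normalize(d).T

    matches, used_tracks, used_dets = [], set(), set()
    # Visit candidate pairs from most to least similar.
    for flat_index in np.argsort(-sim, axis=None):
        i, j = (int(k) for k in np.unravel_index(flat_index, sim.shape))
        if sim[i, j] < min_sim:
            break  # all remaining pairs are below the threshold
        if i in used_tracks or j in used_dets:
            continue  # keep the assignment one-to-one
        matches.append((i, j))
        used_tracks.add(i)
        used_dets.add(j)
    return sorted(matches), sim


# Two tracks and three detections with toy 2-D embeddings.
tracks = [[1.0, 0.0], [0.0, 1.0]]
detections = [[0.0, 2.0], [3.0, 0.1], [-1.0, -1.0]]
matches, sim = associate(tracks, detections)
print(matches)
```

In this toy example the third detection matches no track and would typically start a new track. An optimal assignment solver could replace the greedy loop; either way the similarity matrix costs a single matrix multiplication, which is consistent with the low test-time cost the abstract describes.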

Acknowledgements

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), Ministry of Education, under Grant 2019R1I1A3A01042506.

Author information

Authors and Affiliations

Computer Engineering, Keimyung University, 1095, Dalgubeol-daero, Daegu, 42601, South Korea
Sangwon Kim, Jimi Lee & Byoung Chul Ko

Corresponding author

Correspondence to Byoung Chul Ko.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kim, S., Lee, J. & Ko, B.C. SSL-MOT: self-supervised learning based multi-object tracking. Appl Intell 53, 930–940 (2023). https://doi.org/10.1007/s10489-022-03473-9

Accepted: 04 March 2022
Published: 22 April 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s10489-022-03473-9
