VGT-MOT: visibility-guided tracking for online multiple-object tracking

Wang, Shuai; Li, Wei-Xi; Wang, Lu; Xu, Li-Sheng; Deng, Qing-Xu

doi:10.1007/s00138-023-01398-y

Original Paper
Published: 13 May 2023

Volume 34, article number 50, (2023)

Machine Vision and Applications

Abstract

Multi-object tracking (MOT) is an important computer vision task with a wide range of applications. Most existing MOT methods employ the Kalman filter to predict each object's location in the next frame. However, when the video is captured by a camera whose motion varies significantly, or contains objects moving at non-constant speed, the Kalman filter may fail. In addition, although object occlusion has been studied extensively in MOT, it has not yet been well addressed. To deal with these problems, this paper proposes a joint detection and tracking method named visibility-guided tracking for MOT (VGT-MOT). Specifically, to cope with the difficulty of accurately estimating object positions under drastic camera or object motion variation, VGT-MOT uses an adjacent-frame object location prediction network with inter-frame attention to predict the target position in the next frame. To handle object occlusion, VGT-MOT employs object visibility as a dynamic weight to adaptively fuse the motion and appearance similarities and to update the object appearance representation. VGT-MOT has been evaluated on the MOT16, MOT17 and MOT20 datasets, and the results show that it compares favorably against state-of-the-art MOT approaches. The source code of the proposed method is available at https://github.com/wang-ironman/VGT-MOT.
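
The idea of using visibility as a dynamic weight can be illustrated with a minimal Python sketch. The abstract does not give the fusion or update formulas, so everything below is an assumption made for illustration only: the convex-combination form, the choice to trust appearance more when the object is visible and motion more when it is occluded, the exponential-moving-average update, and the names fuse_similarity, update_appearance and max_rate. The authors' actual formulation should be taken from the paper or the linked repository.

```python
from typing import List


def _clamp01(x: float) -> float:
    """Clamp a value to the interval [0, 1]."""
    return min(max(x, 0.0), 1.0)


def fuse_similarity(motion_sim: float, appearance_sim: float, visibility: float) -> float:
    """Blend motion and appearance similarity using visibility as the weight.

    Assumed form (not taken from the paper): a convex combination in which
    a fully visible object (visibility = 1) relies on appearance and a fully
    occluded one (visibility = 0) relies on motion.
    """
    v = _clamp01(visibility)
    return v * appearance_sim + (1.0 - v) * motion_sim


def update_appearance(
    track_feat: List[float],
    det_feat: List[float],
    visibility: float,
    max_rate: float = 0.5,
) -> List[float]:
    """Move the stored appearance feature toward the new detection feature.

    Assumed form (not taken from the paper): an exponential moving average
    whose step size is max_rate * visibility, so an occluded detection
    barely changes the stored representation.
    """
    rate = max_rate * _clamp01(visibility)
    return [(1.0 - rate) * t + rate * d for t, d in zip(track_feat, det_feat)]


if __name__ == "__main__":
    # A heavily occluded detection: the fused score stays close to the motion score.
    print(fuse_similarity(motion_sim=0.9, appearance_sim=0.3, visibility=0.1))
    # A fully visible detection moves the stored feature halfway (max_rate = 0.5).
    print(update_appearance([1.0, 0.0], [0.0, 1.0], visibility=1.0))
```

Under these assumptions, an occluded detection neither dominates the association score through its unreliable appearance nor contaminates the stored appearance feature, which is the behaviour the abstract attributes to visibility guidance.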

Author information

Authors and Affiliations

School of Computer Science and Engineering, Northeastern University, Hunnan District, Shenyang, 110169, Liaoning, China
Shuai Wang, Wei-Xi Li, Lu Wang & Qing-Xu Deng
College of Medicine and Biological Information Engineering, Northeastern University, Hunnan, Shenyang, 110169, Liaoning, China
Li-Sheng Xu

Corresponding author

Correspondence to Lu Wang.

About this article

Cite this article

Wang, S., Li, WX., Wang, L. et al. VGT-MOT: visibility-guided tracking for online multiple-object tracking. Machine Vision and Applications 34, 50 (2023). https://doi.org/10.1007/s00138-023-01398-y

Received: 30 October 2022
Revised: 31 March 2023
Accepted: 12 April 2023
Published: 13 May 2023
DOI: https://doi.org/10.1007/s00138-023-01398-y
