Multi-object tracking using context-sensitive enhancement via feature fusion

Zhou, Yan; Chen, Junyu; Wang, Dongli; Zhu, Xiaolin

doi:10.1007/s11042-023-16027-z

Published: 27 July 2023

Volume 83, pages 19465–19484, (2024)

Multimedia Tools and Applications

Yan Zhou¹ (ORCID: orcid.org/0000-0002-2372-4947), Junyu Chen¹, Dongli Wang¹ & Xiaolin Zhu²

Abstract

Multi-object tracking (MOT) is one of the most challenging tasks in computer vision. Most MOT methods handle pedestrian features such as size and appearance poorly, which easily leads to missed detections and failures under occlusion. To address this, an end-to-end multi-object tracking network with feature fusion and feature enhancement is proposed. The network integrates feature extraction, object detection, and data association in a single framework. It takes two adjacent frames as an input chain node and uses Inception convolution as the backbone, whose special pre-trained weights enlarge the network's receptive field for multiple targets. In the feature fusion module, a weighted bidirectional pyramid structure is stacked three times so that the network focuses more on key features and adapts better to target deformation. To cope with crowding in complex scenes, context-sensitive prediction modules are added; they contain deeper and wider convolutions that enhance the key information between targets. After this processing, three loss branches are formed, in which the classification branch and the identity branch together form an attention map that is multiplied with the regression branch to ensure accurate regression. In experiments on the MOT16 and MOT17 datasets, the model reaches MOTA scores of 67.9 and 67.7, respectively, at frame rates of up to 30 FPS on a single GPU, and its visualization results improve on those of Chain-Tracker.
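Two ideas in the abstract, weighted feature fusion and attention multiplied into the regression branch, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a normalized weighted sum of the kind used in BiFPN-style pyramids, and it assumes the attention map is the elementwise product of the classification and identity scores; the abstract states neither detail. The function names `weighted_fusion` and `attention_gated_regression` are hypothetical, and feature maps are flattened to plain lists of floats for brevity.

```python
from typing import List, Sequence

Feature = List[float]  # a flattened feature map, used here for illustration only


def weighted_fusion(features: Sequence[Feature],
                    weights: Sequence[float],
                    eps: float = 1e-4) -> Feature:
    """Fuse same-sized feature maps with non-negative, normalized weights.

    Assumption: a normalized weighted sum, out = sum(w_i * x_i) / (eps + sum(w_i)),
    with each w_i clamped at zero. The paper's exact rule may differ.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature maps must have the same size")
    # Clamp weights at zero so no input is subtracted from the fused result.
    w = [max(0.0, x) for x in weights]
    total = sum(w) + eps
    return [sum(wi * f[k] for wi, f in zip(w, features)) / total
            for k in range(length)]


def attention_gated_regression(cls_scores: Feature,
                               id_scores: Feature,
                               reg_features: Feature) -> Feature:
    """Scale regression features by an attention map built from two branches.

    Assumption: attention = classification score * identity score, elementwise.
    """
    if not (len(cls_scores) == len(id_scores) == len(reg_features)):
        raise ValueError("all three branches must have the same size")
    return [c * i * r for c, i, r in zip(cls_scores, id_scores, reg_features)]


if __name__ == "__main__":
    fused = weighted_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
    gated = attention_gated_regression([1.0, 0.5], [1.0, 0.0], fused)
    print(fused, gated)
```

In this sketch, positions where either the classification or the identity score is near zero contribute almost nothing to the regression output, which is one plausible reading of how the two branches "form the attention" for the third.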

Data Availability

Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.

Author information

Authors and Affiliations

School of Automation and Electronic Information, Xiangtan University, Xiangtan, 411105, China
Yan Zhou, Junyu Chen & Dongli Wang
School of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
Xiaolin Zhu

Corresponding author

Correspondence to Yan Zhou.

Ethics declarations

Competing interests

The authors state that they have no conflicting financial interests or personal connections that may have influenced the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, Y., Chen, J., Wang, D. et al. Multi-object tracking using context-sensitive enhancement via feature fusion. Multimed Tools Appl 83, 19465–19484 (2024). https://doi.org/10.1007/s11042-023-16027-z

Received: 09 October 2022
Revised: 14 May 2023
Accepted: 11 June 2023
Published: 27 July 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s11042-023-16027-z
