Multitarget Tracking Using Siamese Neural Networks

Published: 18 May 2021

Abstract

In this article, we detect and track visual objects by using a Siamese network, also known as a twin neural network. The Siamese network is constructed to classify moving objects based on the association between an object detection network and an object tracking network, which are treated as the two branches of the twin network. The tracking method was originally designed for single-target tracking; we extend it to multitarget tracking by using deep neural networks and object detection. The contributions of this article are as follows. First, we implement the proposed method for visual object tracking based on multiclass classification using deep neural networks. Second, we achieve multitarget tracking by combining the object detection network with the single-target tracking network. Third, we improve tracking performance by fusing the outputs of the object detection network and the object tracking network. Finally, we address the object occlusion problem by using intersection over union (IoU) and a similarity score, which effectively reduces the influence of occlusion on multitarget tracking.
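
As a rough illustration of how IoU and a similarity score might be fused to associate tracker outputs with detections, the sketch below computes IoU for axis-aligned boxes and matches tracks to detections greedily. The abstract does not specify the fusion rule or the assignment procedure, so the equal weighting, the threshold, the greedy matching, and the names `iou`, `fused_score`, and `match_detections` are all assumptions made for illustration, not the article's method.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed convention).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fused_score(iou_value: float, similarity: float, weight: float = 0.5) -> float:
    """Hypothetical fusion: a weighted sum of IoU and appearance similarity."""
    return weight * iou_value + (1.0 - weight) * similarity


def match_detections(
    tracks: List[Box],
    detections: List[Box],
    similarity: List[List[float]],
    threshold: float = 0.3,
) -> List[Tuple[int, int]]:
    """Greedy one-to-one matching of tracks to detections by fused score.

    similarity[i][j] is an appearance score in [0, 1] between track i and
    detection j (for example, from a Siamese branch). Pairs scoring below
    `threshold` are left unmatched, which is one simple way to avoid
    assigning a detection to an occluded track.
    """
    candidates = sorted(
        (
            (fused_score(iou(t, d), similarity[i][j]), i, j)
            for i, t in enumerate(tracks)
            for j, d in enumerate(detections)
        ),
        reverse=True,
    )
    used_tracks, used_dets = set(), set()
    matches: List[Tuple[int, int]] = []
    for score, i, j in candidates:
        if score < threshold:
            break  # remaining candidates score even lower
        if i in used_tracks or j in used_dets:
            continue
        matches.append((i, j))
        used_tracks.add(i)
        used_dets.add(j)
    return matches


if __name__ == "__main__":
    tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detections = [(21, 21, 31, 31), (1, 1, 11, 11)]
    similarity = [[0.1, 0.9], [0.8, 0.2]]
    print(match_detections(tracks, detections, similarity))
```

Greedy matching is used here only to keep the sketch dependency-free; an optimal assignment solver could be substituted without changing the scoring functions.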

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 2s
    June 2021
    349 pages
    ISSN: 1551-6857
    EISSN: 1551-6865
    DOI: 10.1145/3465440

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 May 2021
    Accepted: 01 December 2020
    Revised: 01 December 2020
    Received: 01 July 2020
    Published in TOMM Volume 17, Issue 2s

    Author Tags

    1. SSD
    2. SiamRPN
    3. SiamFC
    4. ResNet50
    5. AlexNet

    Qualifiers

    • Research-article
    • Refereed
