
UMTSS: a unifocal motion tracking surveillance system for multi-object tracking in videos

Published in Multimedia Tools and Applications

Abstract

Multiple object detection and tracking play a crucial role in solving several elementary problems in real-time surveillance video analysis and computer vision. The task is challenging, however, because real-time surveillance videos are typically affected by a variety of adverse environmental effects. In this work, we propose a novel surveillance framework, called the unifocal motion tracking surveillance system (UMTSS), for multi-object tracking in real-time videos. The proposed UMTSS combines two significant steps. First, a Faster R-CNN with Inception-v2 model is employed to detect multiple objects efficiently in each video frame. Then, a unifocal feature-based KLT (Kanade-Lucas-Tomasi) method is proposed to track objects across video frames based on the region proposals generated by the object detector in the previous phase. We also propose a new tracking metric, called dynamic tracking accuracy (DTA), to quantify the performance of tracking algorithms. The performance of UMTSS has been evaluated on five standard crowd video databases, namely CrowdHuman, PETS, UCSD, AGORASET and CRCV, and compared with state-of-the-art methods in terms of different qualitative and quantitative measures. The results show that UMTSS outperforms the state-of-the-art methods.
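To make the two-stage pipeline described above concrete, the sketch below pairs a per-frame object detector with KLT feature tracking via pyramidal Lucas-Kanade optical flow in OpenCV. This is a minimal illustration under stated assumptions, not the authors' implementation: the detect_objects stub and the input video path are hypothetical placeholders, and in UMTSS the detector would be a Faster R-CNN with Inception-v2 model whose region proposals seed the tracked features.

```python
# Minimal sketch of a detect-then-track pipeline: detector proposals seed KLT
# feature points, which pyramidal Lucas-Kanade optical flow propagates frame to frame.
import cv2
import numpy as np

def detect_objects(frame):
    """Hypothetical stand-in for a Faster R-CNN + Inception-v2 detector.
    Should return a list of (x, y, w, h) bounding boxes for the frame."""
    return []

lk_params = dict(winSize=(21, 21), maxLevel=3,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

cap = cv2.VideoCapture("surveillance.mp4")  # assumed input video path
ok, prev_frame = cap.read()
prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)

# Seed KLT corner features inside each detected region proposal.
tracks = []
for (x, y, w, h) in detect_objects(prev_frame):
    mask = np.zeros_like(prev_gray)
    mask[y:y + h, x:x + w] = 255
    corners = cv2.goodFeaturesToTrack(prev_gray, maxCorners=20,
                                      qualityLevel=0.01, minDistance=5, mask=mask)
    if corners is not None:
        tracks.append(corners)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for i, pts in enumerate(tracks):
        if pts is None or len(pts) == 0:
            continue
        # Propagate each object's feature points with pyramidal Lucas-Kanade flow,
        # keeping only the points that were tracked successfully.
        new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None, **lk_params)
        tracks[i] = new_pts[status.flatten() == 1].reshape(-1, 1, 2)
    prev_gray = gray
    # A full system would rerun the detector here to refresh proposals and re-seed lost tracks.

cap.release()
```

Re-seeding features from fresh detections at every frame (or at a fixed interval) is what ties the tracker back to the detector's region proposals; the sketch omits data association and the DTA metric for brevity.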


Data availability

Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.



Acknowledgements

This work is supported by DRDO under the project titled ‘Object Identification through Syntactic as well as Semantic Interpretation from given Spatio-Temporal Scenarios’. The project was reviewed and sanctioned by the review committee under grant number ERIP/ER/1404742/M/01/1661. We would like to express our sincere gratitude to the DRDO members for this opportunity.

Author information


Corresponding author

Correspondence to Banani Saha.

Ethics declarations

Competing Interests

The authors declare that they have no competing financial interests or personal relationships that could have influenced the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Hazra, S., Mandal, S., Saha, B. et al. UMTSS: a unifocal motion tracking surveillance system for multi-object tracking in videos. Multimed Tools Appl 82, 12401–12422 (2023). https://doi.org/10.1007/s11042-022-13780-5

