Motion-compensated online object tracking for activity detection and crowd behavior analysis

  • Original article
  • The Visual Computer

Abstract

Managing crowds in public places and recognizing unacceptable behavior (such as violating social distancing norms during the COVID-19 pandemic) is a nontrivial task. In such situations, people should avoid loitering (lingering in public places without apparent purpose) and maintain a sufficient physical distance. In this study, a multi-object tracking algorithm is introduced that better handles short-term object occlusion, detection errors, and identity switches. Objects are tracked frame by frame through bounding box detection, with the linear velocity of each object estimated by a Kalman filter. Predicted tracks are kept alive for some time, which handles missing detections and short-term object occlusion. ID switches (mainly due to crossing trajectories) are managed by explicitly considering the motion direction of the objects in real time. Furthermore, a novel approach is proposed that uses the tracking information to detect loitering as unusual behavior and assign it a severity level. An adaptive algorithm is also proposed to detect physical distance violations based on the object dimensions over the entire length of the track. Finally, a mathematical approach is proposed to calculate the actual physical distance by using the height of a human as a reference object, which allows more specific distancing norms to be checked. The proposed approach is evaluated in traffic and pedestrian movement scenarios. The experimental results demonstrate a significant improvement.
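The idea of using human height as a reference object can be illustrated with a minimal sketch. The paper's actual formulation is not visible in this preview, so everything below is an assumption made for illustration: an average standing height of 1.7 m, boxes given as (x, y, w, h) with a top-left origin, a scale taken as the mean of the two persons' metres-per-pixel ratios, and the hypothetical function names `metres_per_pixel`, `estimate_distance_m`, and `violates_distance`.

```python
import math

# Assumption: average standing height in metres (not a value from the paper).
ASSUMED_HEIGHT_M = 1.7


def metres_per_pixel(box_height_px, person_height_m=ASSUMED_HEIGHT_M):
    """Scale factor derived from one person's bounding-box height."""
    if box_height_px <= 0:
        raise ValueError("bounding-box height must be positive")
    return person_height_m / box_height_px


def estimate_distance_m(box_a, box_b, person_height_m=ASSUMED_HEIGHT_M):
    """Approximate ground distance between two people.

    Boxes are (x, y, w, h) in pixels with a top-left origin (assumption).
    The bottom-centre of each box is used as the foot point.
    """
    ax, ay = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3]
    bx, by = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3]
    pixel_dist = math.hypot(ax - bx, ay - by)
    # Average the two per-person scales so that neither box dominates.
    scale = (metres_per_pixel(box_a[3], person_height_m)
             + metres_per_pixel(box_b[3], person_height_m)) / 2
    return pixel_dist * scale


def violates_distance(box_a, box_b, threshold_m=2.0):
    """True when the estimated distance falls below the threshold."""
    return estimate_distance_m(box_a, box_b) < threshold_m


if __name__ == "__main__":
    near = violates_distance((100, 100, 50, 170), (200, 100, 50, 170))
    far = violates_distance((100, 100, 50, 170), (500, 100, 50, 170))
    print(near, far)
```

Because the scale is recomputed from each pair of boxes, the estimate adapts to how large people appear in the frame. It ignores perspective effects along the viewing direction, so it should be read as a rough approximation rather than a calibrated measurement.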

Funding

This study did not receive any funding.

Author information

Corresponding author

Correspondence to Ashish Singh Patel.

Ethics declarations

Conflict of interest

All authors have participated in the design and approval of the final manuscript version. This manuscript has not been submitted to, and is not under review at, another journal or other publishing venue. The contents of this manuscript are not copyrighted, submitted, or published elsewhere while acceptance by the Journal is under consideration. There are no directly related manuscripts or abstracts, published or unpublished, by any authors of this paper. All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Patel, A.S., Vyas, R., Vyas, O.P. et al. Motion-compensated online object tracking for activity detection and crowd behavior analysis. Vis Comput 39, 2127–2147 (2023). https://doi.org/10.1007/s00371-022-02469-3

  • DOI: https://doi.org/10.1007/s00371-022-02469-3
