
Concept drift adaptation in video surveillance: a systematic review

Multimedia Tools and Applications

Abstract

The world we live in is dynamic by nature, and the environment often changes in ways we cannot predict. In machine learning, the phenomenon in which a model's predictive performance degrades because of unforeseen changes is known as concept drift. Smart video surveillance applications are prone to concept drift caused by changes in illumination, weather, and scene structure. Unlike previous reviews, this work examines concept drift from the perspective of surveillance video, which poses challenges beyond those of other data sources: high dimensionality, spatial and temporal relations between data, and real-time constraints. We compare and discuss the approaches and algorithms used to cope with concept drift, and we present the datasets and metrics used to evaluate their effectiveness.

As contributions, we present a new classification of concept drift adaptation methods, delineate the characteristics and limitations of techniques that deal with concept drift, and analyze practical aspects such as real-time processing and memory constraints. We also find that informed concept drift adaptation methods have been employed 90% less often than continuous adaptation methods.

Research directions include applying established concept drift detection techniques to surveillance video data, exploring datasets for concept drift in surveillance, developing strategies to handle the high dimensionality and volume of surveillance video data when adapting existing models, and creating frameworks that manage drift adaptation while computer vision tasks are running.
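To make the distinction between informed and continuous adaptation concrete, the following is a minimal sketch and not a method taken from the reviewed studies. It assumes that "informed" means adapting only when an explicit drift detector fires and that "continuous" means updating the model on every sample. The detector is a simple Page-Hinkley test. The scalar stream (standing in for a per-frame feature such as mean brightness), the parameter values, and all function names are hypothetical and chosen only for illustration.

```python
from collections import deque


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change (assumed value)
        self.threshold = threshold  # alarm threshold (assumed value)
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.min_cum = 0.0

    def update(self, x):
        """Feed one value; return True when drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


def continuous_adaptation(stream, alpha=0.1):
    """Update the model on every sample, whether or not drift occurred."""
    estimate = stream[0]
    updates = 0
    for x in stream:
        estimate += alpha * (x - estimate)  # exponential moving average
        updates += 1
    return estimate, updates


def informed_adaptation(stream, window=10):
    """Keep the model fixed and rebuild it only when the detector fires."""
    detector = PageHinkley()
    recent = deque(maxlen=window)
    estimate = stream[0]
    drift_points = []
    for t, x in enumerate(stream):
        recent.append(x)
        error = abs(x - estimate)  # the detector monitors the model's error
        if detector.update(error):
            drift_points.append(t)
            estimate = sum(recent) / len(recent)  # rebuild from recent samples
            detector.reset()
    return estimate, drift_points


# Synthetic stream with one abrupt change halfway through (illustrative only).
stream = [0.2] * 50 + [0.8] * 50
print(continuous_adaptation(stream))
print(informed_adaptation(stream))
```

In this toy setting both strategies end close to the new level of the stream, but the continuous variant updates the model at every step while the informed variant rebuilds it only after the detector signals drift. With video data the model update is typically far more expensive than here, which is why the choice between the two strategies matters under real-time and memory constraints.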

Fig. 1. Source: Office of Information Technology of USP (STI-USP)


Data availability

The authors declare that the data supporting the findings of this study are available within the article.


References

  1. Alahakoon D, Halgamuge SK, Srinivasan B (2000) Dynamic self-organizing maps with controlled growth for knowledge discovery. IEEE Trans Neural Networks 11(3):601–614

    Google Scholar 

  2. Alcantara MF, Moreira TP, Pedrini H (2016) Real-time action recognition using a multilayer descriptor with variable size. J Electron Imaging 25(1):013020

    Google Scholar 

  3. Ali S, Bouguila N (2020) Online learning for beta-liouville hidden markov models: Incremental variational learning for video surveillance and action recognition. In 2020 IEEE International Conference on Image Processing (ICIP). IEEE, 10

  4. Anoopa S, Salim A, Beevi SN (2022) Advanced video anomaly detection using 2d cnn and stacked lstm with deep active learning-based model. Kuwait J Sci 6

  5. Baena-Garcia M, del Campo-Ávila J, Fidalgo R, Bifet A, Gavaldã R, Morales-Bueno R (2017) Early drift detection method. 4th ECML PKDD International Workshop on Knowledge Discovery from Data Streams, 6

  6. Bakliwal P, Hegde GM,  Jawahar CV (2017) Collaborative contributions for better annotations. In VISIGRAPP 2017 Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications volume 6

  7. Baltieri D, Vezzani R, and Cucchiara R (2011) 3dpes: 3d people dataset for surveillance and forensics. In MM’11 Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops JHGBU 2011 Workshop, J-HGBU’11

  8. Barddal JP, Gomes HM, Enembreck FC, fahringer BP (2017) A survey on feature drift adaptation: Definition, benchmark, challenges and future directions. J Syst Softw 127:278–294. https://doi.org/10.1016/j.jss.2016.07.005

    Article  Google Scholar 

  9. Barekatain M, Marti M, Shih HF, Murray S, Nakayama K, Matsuo Y, Prendinger H (2017) Okutama-action: An aerial view video dataset for concurrent human action detection. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, volume 2017

  10. Barrow HG, Tenenbaum JM (1981) Computational vision. Proc IEEE 69(5):572–595

    Google Scholar 

  11. Bastani V, Marcenaro L, Regazzoni CS (2016) Online nonparametric bayesian activity mining and analysis from surveillance video. IEEE Trans Image Process 25(5):2089–2102

    MathSciNet  Google Scholar 

  12. Bialkowski A, Denman S, Sridharan S, Fookes C, Lucey P (2012) A database for person re-identification in multi-camera surveillance networks. In 2012 International Conference on Digital Image Computing Techniques and Applications DICTA

  13. Bifet A, Gavaldá R (2007) Learning from time-changing data with adaptive windowing. In Proceedings of the 7th SIAM International Conference on Data Mining

  14. Blank M, Gorelick L, Shechtman E, Irani M, Basri R (2005) Actions as space-time shapes. In Proceedings of the 7th SIAM International Conference on Data Mining, volume II

  15. Bernhard E. Boser, Guyon IM, Vapnik VN (1992) Training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory

  16. Campo D, Slavic G, Baydoun M, Marcenaro L, Regazzoni C (2020) Continual learning of predictive models in video sequences via variational autoencoders. In Proceedings International Conference on Image Processing Process ICIP, volume-October

  17. Cao W, Han H, Sun XK, Fang ZJ (2017) Target re-identification based on adaptive incremental kiss measure learning. Memetic Comput 9(1):23–30

    Google Scholar 

  18. Cao Z, Qin Y, Li Y, Xie Z, Guo J, Jia L (2022) Face detection for rail transit passengers based on single shot detector and active learning. Multimed Tools Appl 8(29):42433–42456

    Google Scholar 

  19. Chaovalit P, Zhou L (2005) Movie review mining: a comparison between supervised and unsupervised classification approaches. In Proc 38th Annual Hawaii Proceedings of the 38th Annual Hawaii International Conference on System Sciences, pp 112c–112c

  20. Chen H, Zhao X, Wang T, Tan M, Sun S (2016) Spatial-temporal context-aware abnormal event detection based on incremental sparse combination learning. In 2016 12th World Con Intell Contr Autom (WCICA). IEEE, 6

  21. Choi W, Shahid K, Savarese S (2009) What are they doing?: Collective activity classification using spatio-temporal relationship among people. In 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops

  22. COCO Consortium (2019) Coco detection evaluation metrics

  23. Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2016-December

  24. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In Proceedings- 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005

  25. De Silva D, Alahakoon D (2010) Incremental knowledge acquisition and self learning from text. In 2010 International Joint Conference on Neural Networks (IJCNN) 1–8

  26. Ding S, Zhu H, Jia W, Chunyang Su (2012) A survey on feature extraction for pattern recognition. Artif Intell Rev 37:3

    Google Scholar 

  27. Disabato S, Roveri M (2019) Learning convolutional neural networks in presence of concept drift. In Proceedings of the International Joint Conference on Neural Networks

  28. Ditzler G, Roveri M, Alippi C, Polikar R (2015) Learning in nonstationary environments: A survey. IEEE Comput Intell Mag 10(4):12–25

    Google Scholar 

  29. Dongre PB, Malik LG (2014) A review on real time data stream classification and adapting to various concept drift scenarios. In 2014 IEEE International Advance Computing Conference (IACC), 533–537

  30. Doshi K, Yilmaz Y (2020) Continual learning for anomaly detection in surveillance videos. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

  31. Doshi K, YilmazY (2022) Multi-task learning for video surveillance with limited data. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, 6, pp. 3888–3898

  32. Doshi K, Yilmaz Y (2022) Rethinking video anomaly detection - a continual learning approach. In Proceedings 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022, 1

  33. Du S, Tao Y, Martinez AM (2014) Compound facial expressions of emotion. Proc Nat Acad Sci United States Am 111(15):E1454

    Google Scholar 

  34. Everingham M, Van Gool L, Williams CKI, Winn J, Zisserman A (2010) The pascal visual object classes (voc) challenge. Int J Comput Vis 88(2):303–338

    Google Scholar 

  35. Fang SC, Venkatesh SS (1995) On batch learning in a binary weight setting. In Proceedings of 1995 IEEE International Symposium on Information Theory, pp 170

  36. Ferryman J, Shahrokni A 2009. Pets: Dataset and challenge. In, 2009 Twelfth IEEE International Workshop on Performance Evaluation of Tracking and Surveillance IEEE 12

  37. Fisher R, Santos-Victor J, Crowley J (2007) Caviar: Context aware vision using image-based active recognition

  38. Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1):119–139

    MathSciNet  Google Scholar 

  39. Gama J, Fernandes R, Rocha R (2006) Decision trees for mining data streams. Intell Data Anal 10(1):23–45

    Google Scholar 

  40. Gama J, Medas P, Castillo G, Rodrigues P (2004) Learning with drift detection. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3171

  41. Gama J, Zliobaite I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv 46(4):1–37

    Google Scholar 

  42. Gözüack A, Can F (2020) Concept learning using one-class classifiers for implicit drift detection in evolving data streams. Artif Intell Rev

  43. Geiger A, Lenz P, Stiller C, Urtasun R (2013) Vision meets robotics: The kitti dataset. Int J Robot Res 32(11):1231–1237

    Google Scholar 

  44. Gepperth A, Hamme B (2016) Incremental learning algorithms and applications. In European Symposium on Artificial Neural Networks (ESANN), Bruges, Belgium

  45. Girshick R (2015) Fast r-cnn. In 2015 IEEE International Conference on Computer Vision (ICCV), pp 1440–1448

  46. Goller C, Kuechler A (1996) Learning task-dependent distributed representations by backpropagation through structure. In IEEE International Conference on Neural Networks Conference Proceedings, 1

  47. Gonzalez J, Prevost L (2021) Personalizing emotion recognition using incremental random forests. In 2021 29th European Signal Processing Conference (EUSIPCO), IEEE, 8 pp. 781–7852021

  48. Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In Advances in Neural Information Processing Systems, 3

  49. Goyette N, Jodoin PM, Porikli F, Konrad J, Ishwar P (2012) Changedetection.net: A new change detection benchmark dataset. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

  50. Gray D, Brennan S, Tao H (2007) Evaluating appearance models for recognition, reacquisition, and tracking. 10th International Workshop on Performance Evaluation for Tracking and Surveillance (PETS), 3

  51. Grimmeisen B, Theissler A (2020) The machine learning model as a guide: Pointing users to interesting instances for labeling through visual cues. In ACM International Conference Proceeding Series

  52. Gu C, Sun C, Ross DA, Vondrick C, Pantofaru C, Li Y, Vijayanarasimhan S, Toderici G, Ricco S, Sukthankar R, Schmid C, Malik J. (2018) Ava: A video dataset of spatio-temporally localized atomic visual actions. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  53. Hamdoun O, Moutarde F, Stanciulescu B, Steux B (2008) Person re-identification in multi-camera system by signature based on interest point descriptors collected on short video sequences. In 2008 2nd ACM/IEEE International Conference on Distributed Smart Cameras, ICDSC

  54. Hampapur A, Brown L, Connell J, Pankanti S, Senior A, Tian Y (2003) Smart surveillance: applications, technologies and implications. In Fourth International Conference on Information, Communications and Signal Processing, 2003 and the Fourth Pacific Rim Conference on Multimedia. Proceedings of the 2003 Joint, volume 2, vol.2 pp. 1133–1138

  55. Hasan M, Paul S, Mourikis AI, Roy-Chowdhury AK (2020) Context-aware query selection for active learning in event recognition. IEEE Trans Pattern Anal Mach Intell 42(3):554–567

    Google Scholar 

  56. Hilsenbeck B, Munch D, Grosselfinger AK, Habner W, Arens M (2017) Action recognition in the longwave infrared and the visible spectrum using hough forests. In Proceedings 2016 IEEE International Symposium on Multimedia, ISM

  57. Hirzer M, Beleznai C, Roth PM, Bischof H (2011) Person re-identification by descriptive and discriminative classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 6688 LNCS

  58. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

    Google Scholar 

  59. Hoogs A, Perera AGA (2008) Video activity recognition in the real world. In Proceedings of the National Conference on Artificial Intelligence, 3

  60. Hossin M, Sulaiman MN (2015) A review on evaluation metrics for data classification evaluations. Int J Data Min Knowl Manag Proc 5(2):01–11

    Google Scholar 

  61. Huang X, Xu J, Guo G (2018) Incremental kernel null foley-sammon transform for person re-identification. In Proceedings International Conference on Pattern Recognition, volume 2018-August

  62. Huang Z, Shan S, Wang R, Zhang H, Lao S, Kuerban A, Chen X (2015) A benchmark and comparative study of video-based face recognition on cox face database. IEEE Trans Image Proc 24(12):5967–5981

    MathSciNet  Google Scholar 

  63. Hu B, Yang C, Shao Y, Yang S (2019) Video-based person re-identification. Nanjing Hangkong Hangtian Daxue Xuebao/Journal of Nanjing University of Aeronautics and Astronautics, 51

  64. Ismail MH, Pakhriazad HZ, Shahrin MF (2009) Evaluating supervised and unsupervised techniques for land cover mapping using remote sensing data. Geografia : Malaysian Journal of Society and Space, 01

  65. Jhuang H, Gall J, Zuffi S, Schmid C, Black MJ (2013) Towards understanding action recognition. In Proceedings of the IEEE International Conference on Computer Vision

  66. Joy F, Vijayakumar V (2021) Multiple object detection in surveillance video with domain adaptive incremental fast rcnn algorithm. Ind J Comput Sci Eng, 12

  67. Keele S (2007) Guidelines for performing systematic literature reviews in software engineering: Technical report. EBSE Technical Report EBSE-2007–01

  68. Khamassi I, Sayed-Mouchaweh M, Hammami M, Ghadira K (2018) Discussion and review on evolving data streams and concept drift adapting. Evolv Syst 9(1):1–23

    Google Scholar 

  69. Khan A, Zhang J, Wang Y (2010) Appearance-based re-identification of people in video. In Proceedings 2010 Digital Image Computing: Techniques and Applications, DICTA

  70. Kharabe SR, Raghu B (2016) Matching of video objects taken from different camera views by using multi-feature fusion and evolutionary learning methods. In Proceedings of the 10th INDIACom; 2016 3rd International Conference on Computing for Sustainable Global Development, INDIACom

  71. Khoshrou S, Cardoso JS, Teixeira LF (2015) Learning from evolving video streams in a multi-camera scenario. Mach Learn 100(2–3):609–633

    MathSciNet  Google Scholar 

  72. Kim W, Tanaka M, Okutomi M, Sasak Y (2021) Adaptive future frame prediction with ensemble network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 12667 LNCS

  73. Krishna MM, Neelima M, Harshali M, Rao MVG (2018) Image classification using deep learning. Int J Eng Technol (UAE) 7(2):614

    Google Scholar 

  74. Kuehne H, Jhuang H, Garrote E, Poggio T, Serre T (2011) Hmdb: A large video database for human motion recognition. In Proceedings of the IEEE International Conference on Computer Vision

  75. Kumari P, Saini M (2020) Multivariate adaptive gaussian mixture for scene level anomaly modeling. In 2020 IEEE Sixth International Conference on Multimedia Big Data (BigMM). IEEE, 9

  76. Kwon B, Kim T (2022) Toward an online continual learning architecture for intrusion detection of video surveillance. IEEE Access 10:89732–89744

    Google Scholar 

  77. Lawrence S, Giles CL, Tsoi AC, Back AD (1997) Face recognition: A convolutional neural-network approach. IEEE Trans Neur Netw, 8:98–113

  78. Lecun Y, Leon Bottou Y, Bengio PH (1998) Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11):2278–2324

    Google Scholar 

  79. Lim CP Harrison RF (1995) Probabilistic fuzzy artmap: an autonomous neural network architecture for bayesian probability estimation. In 1995 Fourth International Conference on Artificial Neural Networks, pp 148–153

  80. Lin H, Deng JD, Woodford BJ, Shahi A (2016) Online weighted clustering for real-time abnormal event detection in video surveillance. In MM 2016 Proceedings of the 2016 ACM Multimedia Conference

  81. Lin H, Deng JD, Woodford BJ (2015) Anomaly detection in crowd scenes via online adaptive one-class support vector machines. In Proceedings International Conference on Image Processing, ICIP, volume 2015-December

  82. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) Ssd: Single shot multibox detector. Lecture Notes in Computer Science, pp. 21–37

  83. Li T, Fong S, Wong KKL, Ying W, Yang XS, Li X (2020) Fusing wearable and remote sensing data streams by fast incremental learning with swarm decision table for human activity recognition. Information Fusion 60:41–64. https://doi.org/10.1016/j.inffus.2020.02.001

    Article  Google Scholar 

  84. Li W, Wang X (2013) Locally aligned feature transforms across views. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  85. Li W, Zhao R, Xiao T, Wang X (2014) Deepreid: Deep filter pairing neural network for person re-identification. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  86. Lopez-Lopez E, Regueiro CV, Pardo XM, Franco A, Lumini A (2021) Towards a self-sufficient face verification system. Expert Syst Appl, 174

  87. Lopez-Lopez E, Regueiro CV, Pardo XM. (2021) An adaptive video-to-video face identification system based on self-training. In 2020 25th International Conference on Pattern Recognition (ICPR), pp 2590–2596. IEEE, 1

  88. Loy CC, Xiang T, Gong S (2009) Multi-camera activity correlation analysis. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 6

  89. Lucey P, Cohn JF, Kanade T, Saragih J, Ambadar Z, Matthews I (2010) The extended cohn-kanade dataset (ck+): A complete dataset for action unit and emotion-specified expression. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2010

  90. Luo W, Liu W, Gao S (2017) A revisit of sparse coding based anomaly detection in stacked rnn framework. In Proceedings of the IEEE International Conference on Computer Vision

  91. Lu C, Shi J, Jia J (2013) Abnormal event detection at 150 fps in matlab. In Proceedings of the IEEE International Conference on Computer Vision

  92. Lu D, Weng Q (2007) A survey of image classification methods and techniques for improving classification performance. Int J Remote Sensing 28(5):823–870

    Google Scholar 

  93. Lu J, Liu A, Dong F, Gu F, Gama J, Zhang G (2019) Learning under concept drift: A review. IEEE Transactions on Knowledge and Data Engineering, 31

  94. Lv J, Chen W, Li Q, Yang C (2018) Unsupervised cross-dataset person re-identification by transfer learning of spatial-temporal patterns. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  95. Mahadevan V, Li W, Bhalodia V, Vasconcelos N (2010) Anomaly detection in crowded scenes. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 6

  96. Mairal J, Bach F, Ponce J, Sapiro G (2009) Online dictionary learning for sparse coding. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML ’09, page 689-696, New York, NY, USA, 2009. Association for Computing Machinery

  97. Martos G, Muñoz A, González J (2013) On the generalization of the mahalanobis distance. In José Ruiz-Shulcloper and Gabriella Sanniti di Baja, editors, Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, pp. 125–132, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg

  98. McCloskey M, Cohen NJ (1989) Catastrophic interference in connectionist networks: The sequential learning problem. In Gordon H. Bower, editor, Psychology of Learning and Motivation, volume 24, pages 109–165. Academic Press

  99. McCulloch WS, Pitts W (1988) A Logical Calculus of the Ideas Immanent in Nervous Activity, pp 15-27. MIT Press, Cambridge, MA, USA

    Google Scholar 

  100. Mehran R, Oyama A, Shah M (2011) Umn dataset

  101. Menze M, Geiger A (2015) Object scene flow for autonomous vehicles. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  102. Messing R, Pal C, Kautz H (2009) Activity recognition using the velocity histories of tracked keypoints. In Proceedings of the IEEE International Conference on Computer Vision

  103. Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space

  104. Nallaperuma D, Nawaratne R, Bandaragoda T, Adikari A, Nguyen S, Kempitiya T, De Silva D, Alahakoon D, Pothuhera D (2019) Online incremental machine learning platform for big data-driven smart traffic management. IEEE Trans Intell Transport Syst 20(12):4679–4690

    Google Scholar 

  105. Nawaratne R, Alahakoon D, De Silva D, Yu X (2020) Spatiotemporal anomaly detection using deep learning for real-time video surveillance. IEEE Transactions on Industrial Informatics 16(1):393–402

    Google Scholar 

  106. Nawaratne R, Bandaragoda T, Adikari A, Alahakoon D, De Silva D, Yu X (2017) Incremental knowledge acquisition and self-learning for autonomous video surveillance. In Proceedings IECON 2017 43rd Annual Conference of the IEEE Industrial Electronics Society, 2017-January

  107. Nguyen-Meidine LT, Kiran M, Pedersoli M, Dolz J, Blais-Morin LA, Granger E (2022) Incremental multi-target domain adaptation for object detection with efficient domain transfer. Pattern Recognit 129:108771. https://doi.org/10.1016/j.patcog.2022.108771

    Article  Google Scholar 

  108. Nguyen DB (2016) Context-based classifier grids learning for object detection in surveillance systems. In Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, volume 165

  109. Norman TL (2017) Chapter 6 electronics elements: A detailed discussion - originally from integrated security systems design. thomas norman: Butterworth-heinemann, 2015. updated by the editor, elsevier, 2016. In Lawrence J. Fennelly, editor, Effective Physical Security (Fifth Edition), pages 95 137. Butterworth-Heinemann, fifth edition edition

  110. UCF University of Central Florida. (2011) Ucf aerial dataset

  111. Oh S, Hoogs A, Perera A, Cuntoor N, Chen CC, Lee JT, Mukherjee S, Aggarwal JK, Lee H, Davis L, Swears E, Wang X, Ji Q, Reddy K, Shah M, Vondrick C, Pirsiavash H, Ramanan D, Yuen J, Torralba A, Song B, Fong A, Chowdhury AR, Desai M (2011) A large-scale benchmark dataset for event recognition in surveillance video. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  112. Pagano C, Granger E, Sabourin R, Marcialis GL, Roli F (2015) Adaptive classification for person re-identification driven by change detection. In ICPRAM 2015 4th International Conference on Pattern Recognition Applications and Methods, Proceedings, 1

  113. Page ES (1954) Continuous inspection schemes. Biometrika, 41

  114. Pérez-Sánchez B, Fontenla-Romero O, Guijarro-Berdiñas B (2018) A review of adaptive online learning for artificial neural networks. Artif Intell Rev 49(2):281–299

    Google Scholar 

  115. Patino L, Ferryman J (2014) Pets 2014: Dataset and challenge. In 11th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS

  116. Patron-Perez A, Marszalek M, Zisserman A, Reid I (2010) High five: Recognising human interactions in tv shows. In British Machine Vision Conference, BMVC 2010 Proceedings

  117. Pei M, Jia Y, Zhu SC (2011) Parsing video events with goal inference and intent prediction. In Proceedings of the IEEE International Conference on Computer Vision

  118. Pillai GV, Sen D (2021) Anomaly detection in nonstationary videos using time-recursive differencing network-based prediction. IEEE Geoscience and Remote Sensing Letters

  119. Quinlan JR (1986) Induction of decision trees. Machine Learning, 1

  120. Rabiner LR (1989) A tutorial on hidden markov models and selected applications in speech recognition. Proc IEEE 77(2):257–286

    Google Scholar 

  121. Raj SS, Prasad MVNK, Balakrishnan R (2020) Deep manifold clustering based optimal pseudo pose representation (dmc-oppr) for unsupervised person re-identification. Image and Vision Computing, 101,103-956. https://doi.org/10.1016/j.imavis.2020.103956

  122. Ramchandran A, Sangaiah AK (2019) Unsupervised deep learning system for local anomaly event detection in crowded scenes. Multimed Tools Appl 79(47–48):35275–35295

    Google Scholar 

  123. Reddy KK, Shah M (2013) Recognizing 50 human action categories of web videos. Mach Vis Appl 24(5):971–981

    Google Scholar 

  124. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  125. Redmon J, Farhadi A (2018) Yolov3: An incremental improvement. arXiv: 1804.02767

  126. Ren S, He Y, Wang X, Guo K, Barra S, Li J (2022) Ciod: an intelligent class-incremental object detection system with nearest mean of exemplars. J Ambient Intell Human Comput, 7

  127. Robicquet A, Sadeghian A, Alahi A, Savarese S (2016) Learning social etiquette: Human trajectory understanding in crowded scenes. In European Conference on Computer Vision (ECCV) 549–565

  128. Rodriguez-Moreno I, Martinez-Otzeta JM, Sierra B, Rodriguez I, Jauregi E (2019) Video activity recognition: State-of-the-art. Sensors (Switzerland) 19:7

    Google Scholar 

  129. Rodriguez MD, Ahmed J, Shah M (2008) Action mach: A spatio-temporal maximum average correlation height filter for action recognition. In 26th IEEE 26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR, CVPR

  130. Rohrbach M, Amin S, Andriluka M, Schiele B (2012) A database for fine grained activity detection of cooking activities. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  131. Russakovsky O, Deng J, Hao S, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) Imagenet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252

    MathSciNet  Google Scholar 

  132. RyooMS, Aggarwal JK. (2009) Spatio-temporal relationship match: Video structure comparison for recognition of complex human activities. In Proc IEEE International Conference on Computer Vision

  133. Saito T, Rehmsmeier M (2009) The precision-recall plot is more informative than the roc plot when evaluating binary classifiers on imbalanced datasets. PLOS ONE 10(3):1–21

    Google Scholar 

  134. Saligrama V, Konrad J, Jodoin PM (2010) Video anomaly identification. IEEE Signal Proc Mag 27(5):18–33

    Google Scholar 

  135. Saunier N (2010) A public video dataset for road transportation applications. In 93rd Annu. Meeting Transp Res Board, 1–12

  136. Schüldt C, Caputo B, Sch C, Barbara L (2017) Recognizing human actions : A local svm approach recognizing human actions. Pattern Recognit, 2004. ICPR 2004. Proc 17th Int Conf, 3

  137. Schlimmer JC, Granger RH (1986) Incremental learning from noisy data. Mach Learn 1(3):317–354

    Google Scholar 

  138. Settles B (2011) From theories to queries: Active learning in practice. In Isabelle Guyon, Gavin Cawley, Gideon Dror, Vincent Lemaire, and Alexander Statnikov, editors, Active Learning and Experimental Design workshop In conjunction with AISTATS 2010, volume 16 of Proceedings of Machine Learning Research, pages 1–18, Sardinia, Italy, 16 May 2011. JMLR Workshop and Conference Proceedings

  139. Shin DK, Ahmed MU, Rhee PK (2018) Incremental deep learning for robust object detection in unknown cluttered environments. IEEE Access 6:61748–61760. https://doi.org/10.1109/ACCESS.2018.2875720

    Article  Google Scholar 

  140. Shobha BS, Deepu R (2018) A review on video based vehicle detection, recognition and tracking. In 2018 3rd Int Conf Comput Syst Inform Technol Sustain Solut (CSITSS), pages 183–186

  141. Silverman BW, Jones MC (1951) E. fix and j.l. hodges: An important contribution to nonparametric discriminant analysis and density estimation: Commentary on fix and hodges (1951). Int Stat Review / Revue Int de Statistiq 57:1989

    Google Scholar 

  142. Singh S, Velastin SA, Ragheb H (2010) Muhavi: A multicamera human action video dataset for the evaluation of action recognition methods. In Proc IEEE International Conference on Advanced Video and Signal Based Surveillance, AVS

  143. Soomro K, Idrees H, Shah M (2019) Online localization and prediction of actions and interactions. IEEE Trans Pattern Anal Mach Intell 41(2):459–472

    Google Scholar 

  144. Soomro K, Zamir AR, Shah M (2012) Ucf101: A dataset of 101 human actions classes from videos in the wild

  145. Soomro K, Zamir AR (2014) Action recognition in realistic sports videos. Adv Comput Vision Pattern Recognit 71:181–208. https://doi.org/10.1007/978-3-319-09396-3_9

    Article  Google Scholar 

  146. Stein S, McKenna SJ (2013) Combining embedded accelerometers with computer vision for recognizing food preparation activities. In UbiComp 2013 - Proceedings of the 2013 ACM International Joint Conference on Pervasive and Ubiquitous Computing 

  147. Sugianto N, Tjondronegoro D, Sorwar G, Chakraborty P, Yuwono EI (2019) Continuous learning without forgetting for person re-identification. In 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS

  148. Suprem A, Arulraj J, Calton P, Ferreira J (2020) Odin: Automated drift detection and recovery in video analytics. Proc VLDB Endow 13(12):2453–2465

    Google Scholar 

  149. Teng E, Falcao JD, Huang R, Iannucci B (2018) Clickbait: Click-based accelerated incremental training of convolutional neural networks. In Proceedings Applied Imagery Pattern Recognition Workshop

  150. Thirde D, Li L, Ferryman F (2006) Overview of the pets2006 challenge. In 9th IEEE IEEE Int. Workshop Perform. Eval.Tracking Surveill. (PETS) 47–50

  151. Martinez Torres D, Correa HL, Bravo EC (2006) Online learning of contexts for detecting suspicious behaviors in surveillance videos. Image Vis Comput 89:1–26. https://doi.org/10.1007/BFb0053993

  152. Tsoi AC (1998) Recurrent neural network architectures: An overview, pages 1–26. Springer Berlin Heidelberg, Berlin, Heidelberg

  153. Turaga P, Chellappa R, Subrahmanian VS, Udrea O (2008) Machine recognition of human activities: A survey. IEEE Trans Circuits Syst Vid Technol 18(11):1473–1488

  154. Ullah A, Muhammad K, Haq IU, Baik SW (2019) Action recognition using optimized deep autoencoder and CNN for surveillance data streams of non-stationary environments. Future Gener Comput Syst 96:386–397. https://doi.org/10.1016/j.future.2019.01.029

  155. Wang H, Yan Y, Hua J, Yang Y, Wang X, Li XL, Deller JR, Zhang G, Bao H (2017) Pedestrian recognition in multi-camera networks using multilevel important salient feature and multicategory incremental learning. Pattern Recognit 67:340–352. https://doi.org/10.1016/j.patcog.2017.01.033

  156. Wang X, Hu Y, Radwin RG, Lee JD (2018) Frame-subsampled, drift-resilient long-term video object tracking. In 2018 IEEE International Conference on Multimedia and Expo (ICME), 1–6

  157. Weinland D, Ronfard R, Boyer E (2006) Free viewpoint action recognition using motion history volumes. Comput Vis Image Understand 104(2–3):249–257

  158. Wei L, Zhang S, Gao W, Tian Q (2018) Person transfer GAN to bridge domain gap for person re-identification. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  159. Widmer G (1996) Learning in the presence of concept drift and hidden contexts. Mach Learn 23(1):69–101

  160. Wolf L, Hassner T, Maoz I (2011) Face recognition in unconstrained videos with matched background similarity. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  161. Xiang T, Gong S (2008) Video behavior profiling for anomaly detection. IEEE Trans Patt Anal Mach Intell 30(5):893–908

  162. Xiao Y, Tian Z, Yu J, Zhang Y, Liu S, Du S, Lan X (2020) A review of object detection based on deep learning. Multimed Tools Appl 79(33–34):23729–23791

  163. Yang P, Xiong N, Ren J (2020) Data security and privacy protection for cloud storage: A survey. IEEE Access 8:131723–131740

  164. Yuan J, Liu Z, Wu Y (2011) Discriminative video pattern search for efficient action detection. IEEE Trans Patt Anal Mach Intell 33(9):1728–1743

  165. Yu F, Xian W, Chen Y, Liu F, Liao M, Madhavan V, Darrell T (2018) BDD100K: A diverse driving video database with scalable annotation tooling. arXiv

  166. Zheng L, Shen L, Tian L, Wang S, Wang J, Tian Q (2017) Scalable person re-identification: A benchmark. The IEEE International Conference on Computer Vision (ICCV)

  167. Zheng Z, Zheng L, Yang Y (2017) Unlabeled samples generated by GAN improve the person re-identification baseline in vitro. In Proceedings of the IEEE International Conference on Computer Vision

Acknowledgements

The authors would like to thank the Office of Information Technology of USP (STI-USP).

Funding

This project is supported by the São Paulo Research Foundation (FAPESP) (grant #2020/06950–4), the Research Dean—PRP-USP (grant #668/2018), the Brazilian National Council of Scientific and Technological Development (CNPq) (grant #309030/2019-6), and the National Institute of Science and Technology in Medicine Assisted by Scientific Computing (INCT-MACC) (grant #157535/2017-7).

Author information

Corresponding author

Correspondence to Vinicius P. M. Goncalves.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Goncalves, V.P.M., Silva, L.P., Nunes, F.L.S. et al. Concept drift adaptation in video surveillance: a systematic review. Multimed Tools Appl 83, 9997–10037 (2024). https://doi.org/10.1007/s11042-023-15855-3

  • DOI: https://doi.org/10.1007/s11042-023-15855-3
