Abstract
Online learning algorithms have been widely used to address vision problems such as object detection and tracking. However, a robust online learning object detection system that can continuously improve its performance through self-learning continues to elude designers. This study proposes a novel online learning framework that combines detection and verification modules to train a scene-specific head detector on the fly. For the detection module, a proposed online bootstrap cascade classifier is employed as the object detector of the framework. The cascade decision strategy is used to integrate a number of weak online classifiers, so that the resulting system contains sufficient weak classifiers while maintaining a low computation cost. During the online learning process, the complexity of the cascade structure adapts to the difficulty of the detection task. For the verification module, a simple yet effective particle filter tracking algorithm, based on information fusion, is used to automatically label the online learning samples produced by detection responses. With this method, the object detector improves its detection performance by learning from these samples autonomously. The online head detection framework is ported to the NVIDIA Jetson TK1 embedded platform, which enables the platform to recognize different head postures through self-learning. Experimental results on three video datasets demonstrate the effectiveness of the framework.
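The cascade decision strategy described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `OnlineStump` and `OnlineCascade`, the running-mean threshold rule, and the toy two-feature samples are all assumptions made for illustration. The sketch shows only two ideas from the abstract: a sample is rejected at the first stage that refuses it, which keeps the average cost low, and later stages are trained only on negatives that earlier stages failed to reject (a bootstrap-style update).

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class OnlineStump:
    """Hypothetical weak online classifier: thresholds one feature at the
    midpoint between the running means of positive and negative samples."""
    feature: int
    pos_mean: float = 0.0
    neg_mean: float = 0.0
    pos_n: int = 0
    neg_n: int = 0

    def update(self, x: Sequence[float], label: bool) -> None:
        # Incremental running-mean update; no past samples are stored.
        v = x[self.feature]
        if label:
            self.pos_n += 1
            self.pos_mean += (v - self.pos_mean) / self.pos_n
        else:
            self.neg_n += 1
            self.neg_mean += (v - self.neg_mean) / self.neg_n

    def predict(self, x: Sequence[float]) -> bool:
        threshold = (self.pos_mean + self.neg_mean) / 2.0
        v = x[self.feature]
        # Accept on whichever side of the threshold the positive mean lies.
        if self.pos_mean >= self.neg_mean:
            return v >= threshold
        return v < threshold


@dataclass
class OnlineCascade:
    """Hypothetical cascade of weak online classifiers with early rejection."""
    stages: List[OnlineStump] = field(default_factory=list)

    def classify(self, x: Sequence[float]) -> Tuple[bool, int]:
        # Returns (accepted, number of stages evaluated).
        for i, stage in enumerate(self.stages, start=1):
            if not stage.predict(x):
                return False, i  # early rejection: later stages are skipped
        return True, len(self.stages)

    def update(self, x: Sequence[float], label: bool) -> None:
        # Positives train every stage; a negative stops propagating once a
        # stage rejects it, so later stages see only the harder negatives.
        for stage in self.stages:
            stage.update(x, label)
            if not label and not stage.predict(x):
                break


def train_demo() -> OnlineCascade:
    """Train a two-stage cascade on toy samples (assumed data)."""
    cascade = OnlineCascade([OnlineStump(feature=0), OnlineStump(feature=1)])
    for _ in range(2):
        cascade.update((1.0, 1.0), True)    # positive sample
        cascade.update((0.0, 1.0), False)   # easy negative
        cascade.update((1.0, 0.0), False)   # hard negative
    return cascade


if __name__ == "__main__":
    demo = train_demo()
    for sample in [(1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]:
        print(sample, demo.classify(sample))
```

In this toy setting the easy negative is rejected after a single stage, whereas the hard negative is rejected only at the second stage. The sketch does not grow or shrink the cascade during learning, so it does not reproduce the adaptive complexity described in the abstract.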
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61302137 and 61603357), the Wuhan “Huanghe Elite Project”, and the Fundamental Research Funds for the Central Universities Young Teacher Promotion Program-Outstanding Youth Foundation, China University of Geosciences (Wuhan) (CUGL170210).
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Luo, D., Mou, Q., Zeng, Z. et al. Bio-inspired head detection framework based on online learning algorithm. Multimed Tools Appl 79, 19509–19536 (2020). https://doi.org/10.1007/s11042-020-08744-6
DOI: https://doi.org/10.1007/s11042-020-08744-6