Bio-inspired head detection framework based on online learning algorithm

Multimedia Tools and Applications

Abstract

Online learning algorithms are widely used for vision tasks such as object detection and tracking. However, a robust online-learning object detection system that continuously improves its performance through self-learning remains elusive. This study proposes a novel online learning framework that combines a detection module and a verification module to train a scene-specific head detector on the fly. In the detection module, the proposed online bootstrap cascade classifier serves as the object detector of the framework. A cascade decision strategy integrates a number of weak online classifiers, so the resulting system can contain sufficient weak classifiers while maintaining a low computation cost. During the online learning process, the complexity of the cascade structure adapts to the difficulty of the detection task. In the verification module, a simple yet effective particle filter tracking algorithm based on information fusion automatically labels the online learning samples produced by detection responses. The object detector then improves its detection performance by learning from these samples autonomously. The online head detection framework is ported to the NVIDIA Jetson TK1 embedded platform, enabling the platform to recognize different head postures through self-learning. Experimental results on three video datasets demonstrate the effectiveness of the framework.
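The cascade idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the perceptron-style stage, the class names `OnlineLinearStage` and `OnlineCascade`, the false-positive window, and all thresholds are assumptions chosen for illustration. The sketch shows three things the abstract describes: early rejection through a cascade of weak online classifiers, stage-wise updates on the samples that reach each stage, and a cascade that grows only when the task proves difficult. In the framework described above, the labels `y` would come from the tracking-based verification module; here they are synthetic.

```python
import random


class OnlineLinearStage:
    """Hypothetical weak online classifier: a linear scorer with perceptron updates."""

    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.b = 0.0
        self.lr = lr

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x, y):
        # Perceptron rule: adjust weights only when the sample is misclassified.
        if y * self.score(x) <= 0:
            self.w = [wi + self.lr * y * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * y


class OnlineCascade:
    """Assumed cascade: a sample is accepted only if every stage accepts it."""

    def __init__(self, dim, max_stages=5, window=50, max_fp_rate=0.2):
        self.dim = dim
        self.max_stages = max_stages
        self.window = window            # number of recent negatives to monitor
        self.max_fp_rate = max_fp_rate  # growth trigger (illustrative value)
        self.stages = [OnlineLinearStage(dim)]
        self.recent_fp = []             # 1 where a negative passed all stages

    def predict(self, x):
        # Early rejection keeps the average cost low: most negatives
        # are discarded by the first stages.
        for stage in self.stages:
            if stage.score(x) < 0:
                return -1
        return 1

    def learn(self, x, y):
        # y is +1 (head) or -1 (background), e.g. supplied by a verifier.
        if y == -1:
            self.recent_fp.append(1 if self.predict(x) == 1 else 0)
            self.recent_fp = self.recent_fp[-self.window:]
        # Each stage trains only on samples that reached it.
        for stage in self.stages:
            passed = stage.score(x) >= 0
            stage.update(x, y)
            if not passed:
                break
        self._maybe_grow()

    def _maybe_grow(self):
        # Add a stage only when recent negatives still slip through too often,
        # so cascade depth follows the difficulty of the task.
        if len(self.recent_fp) < self.window or len(self.stages) >= self.max_stages:
            return
        if sum(self.recent_fp) / self.window > self.max_fp_rate:
            self.stages.append(OnlineLinearStage(self.dim))
            self.recent_fp = []


def demo_stream(n, seed=0):
    """Synthetic, well-separated 2-D samples standing in for verified detections."""
    rng = random.Random(seed)
    for i in range(n):
        y = 1 if i % 2 == 0 else -1
        yield [2 * y + rng.uniform(-0.5, 0.5), 2 * y + rng.uniform(-0.5, 0.5)], y


if __name__ == "__main__":
    cascade = OnlineCascade(dim=2)
    for x, y in demo_stream(200):
        cascade.learn(x, y)
    print(cascade.predict([2.0, 2.0]), cascade.predict([-2.0, -2.0]), len(cascade.stages))
```

On easy, well-separated data such as this synthetic stream, the sketch keeps a single stage; additional stages would be appended only if negatives kept passing the existing ones.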

Acknowledgements

This work was supported by the National Natural Science Foundation of China (61302137 and 61603357), the Wuhan “Huanghe Elite Project”, and the Fundamental Research Funds for the Central Universities Young Teacher Promotion Program-Outstanding Youth Foundation, China University of Geosciences (Wuhan) (CUGL170210).

Author information

Corresponding authors

Correspondence to Longsheng Wei or Xiangli Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Luo, D., Mou, Q., Zeng, Z. et al. Bio-inspired head detection framework based on online learning algorithm. Multimed Tools Appl 79, 19509–19536 (2020). https://doi.org/10.1007/s11042-020-08744-6

  • DOI: https://doi.org/10.1007/s11042-020-08744-6
