Abstract
Image-based human detection remains a challenging problem. The most promising detectors rely on classifiers trained with labelled samples. However, labelling is a labor-intensive manual step. To overcome this problem, we propose to collect images of pedestrians from a virtual city, i.e., with automatic labels, and to train a pedestrian detector with them. The resulting detector performs correctly when the virtual-world data are similar to the testing data, i.e., real-world pedestrians in urban areas. When the testing data are acquired under different conditions than the training data, e.g., human detection in personal photo albums, dataset shift appears. In previous work, we treated this problem as one of domain adaptation and solved it with an active learning procedure. In this work, we focus on the same problem but evaluate a different set of faster-to-compute features, i.e., Haar, EOH and their combination. In particular, we train a classifier with virtual-world data, using these features and Real AdaBoost as the learning machine. This classifier is applied to real-world training images. Then, a human oracle interactively corrects the wrong detections, i.e., a few missed detections are manually annotated and some false detections are pointed out too. The amount of manual annotation is restricted to a low, fixed budget. Real- and virtual-world difficult samples are combined within what we call the cool world, and we retrain the classifier with these data. Our experiments show that this adapted classifier is equivalent to the one trained with only real-world data but requiring 90 annotations.
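The train, correct, combine and retrain loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the nearest-centroid learner is a toy stand-in for Real AdaBoost over Haar/EOH features, the 2-D Gaussian samples stand in for virtual- and real-world image windows, the human oracle is simulated by reading ground-truth labels, and the budget of 40 corrections is arbitrary. It is not the chapter's implementation.

```python
import random


def train_centroids(samples):
    """Toy stand-in learner (assumption): one mean feature vector per label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def oracle_corrections(centroids, real_pool, budget):
    """Simulated oracle (assumption): return at most `budget` real samples
    that the current classifier gets wrong, paired with their true labels.
    Both missed positives and false positives can be returned."""
    corrections = []
    for features, true_label in real_pool:
        if len(corrections) >= budget:
            break
        if predict(centroids, features) != true_label:
            corrections.append((features, true_label))
    return corrections


def make_domain(rng, n, pos_center, neg_center, spread=0.5):
    """Synthetic 2-D samples: label 1 = pedestrian, label 0 = background."""
    data = []
    for _ in range(n):
        data.append(([rng.gauss(c, spread) for c in pos_center], 1))
        data.append(([rng.gauss(c, spread) for c in neg_center], 0))
    return data


rng = random.Random(0)
virtual = make_domain(rng, 200, (2.0, 2.0), (0.0, 0.0))         # automatic labels
real_train = make_domain(rng, 200, (1.5, 0.2), (-0.5, -1.5))    # shifted domain

# 1) Train on virtual-world data only.
source_model = train_centroids(virtual)
# 2) Apply to real-world training data; the oracle corrects a few errors.
budget = 40
corrections = oracle_corrections(source_model, real_train, budget)
# 3) Combine virtual samples with the corrected real samples ("cool world").
cool_world = virtual + corrections
# 4) Retrain on the combined set.
adapted_model = train_centroids(cool_world)
```

The sketch only shows the data flow; it makes no claim about how much the adapted classifier improves, which depends on the learner, the features and the annotation budget.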
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Vázquez, D., López, A.M., Ponsa, D., Gerónimo, D. (2013). Interactive Training of Human Detectors. In: Multimodal Interaction in Image and Video Applications. Intelligent Systems Reference Library, vol 48. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35932-3_10
DOI: https://doi.org/10.1007/978-3-642-35932-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35931-6
Online ISBN: 978-3-642-35932-3
eBook Packages: Engineering, Engineering (R0)