Abstract
Pedestrian detection is a fundamental problem in video surveillance and has made great progress in recent years. However, training a generic detector that performs well across a wide variety of scenes has proven very difficult. On the other hand, exhaustive manual labeling for each specific scene to achieve high detection accuracy is not acceptable, especially for video surveillance applications. To reduce the manual labeling effort without sacrificing detection accuracy, we propose a transfer learning framework that automatically trains a scene-specific pedestrian detector starting from a pre-trained generic detector. In our framework, sparse coding is used to compute similarities between source samples and a small set of selected target samples, with the source samples serving as the dictionary. These similarities are then used to compute the weights of the source samples. The weights of the initially detected target samples are computed in a similar way, but with the selected target dataset as the dictionary. By using these weighted samples during re-training, our framework can efficiently obtain a scene-specific pedestrian detector. Our experiments on the VIRAT dataset show that the trained scene-specific pedestrian detector performs well and is comparable to a detector trained on a large number of manually labeled samples from the target scene.
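The source-weighting step described in the abstract can be illustrated with a minimal sketch. The abstract gives neither the solver nor the exact weighting formula, so everything here is an assumption: the function names `sparse_code` and `source_weights` are hypothetical, the L1-regularized problem is solved with plain ISTA, and a source sample's weight is taken to be its normalized total absolute coefficient over all selected target samples.

```python
import numpy as np


def sparse_code(D, x, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - step * (D.T @ (D @ a - x))  # gradient step on the quadratic term
        a = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return a


def source_weights(source, targets, lam=0.1):
    """Weight each source sample by how much the target samples rely on it.

    source:  (n_source, d) feature vectors, used as dictionary atoms
    targets: (n_target, d) feature vectors of the selected target samples
    Returns an (n_source,) array that sums to 1 (assumed weighting rule).
    """
    norms = np.maximum(np.linalg.norm(source, axis=1), 1e-12)
    D = (source / norms[:, None]).T  # (d, n_source), unit-norm columns
    usage = np.zeros(source.shape[0])
    for x in targets:
        usage += np.abs(sparse_code(D, x, lam))  # accumulate coefficient magnitude
    total = usage.sum()
    return usage / total if total > 0 else usage


# Toy data: three source samples and one target sample close to the first one.
source = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])
targets = np.array([[0.9, 0.05, 0.0, 0.0]])
print(source_weights(source, targets))
```

In this toy case the weight concentrates on the first source sample, the one the target sample resembles. Under the same assumptions, weights for the initially detected target samples could be derived analogously by coding them over a dictionary built from the selected target set.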
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liang, F., Tang, S., Wang, Y., Han, Q., Li, J. (2013). A Sparse Coding Based Transfer Learning Framework for Pedestrian Detection. In: Li, S., et al. Advances in Multimedia Modeling. Lecture Notes in Computer Science, vol 7733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35728-2_26
DOI: https://doi.org/10.1007/978-3-642-35728-2_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35727-5
Online ISBN: 978-3-642-35728-2
eBook Packages: Computer Science (R0)