Abstract
Pedestrian detection is a fundamental problem in video surveillance and has made great progress in recent years. However, training a generic detector that performs well across a wide variety of scenes has proved very difficult. On the other hand, exhaustive manual labeling for each specific scene to achieve high detection accuracy is unacceptable, especially for video surveillance applications. To reduce manual labeling effort without sacrificing detection accuracy, we propose a transfer learning framework based on sparse coding for pedestrian detection. In our method, a generic detector is used to obtain the initial target samples, and several filters then select a small subset of these samples (called target templates) whose labels and confidence values we are highly certain about. The relevancy between source samples and target templates, and between target samples and target templates, is estimated by sparse coding and then used to calculate weights for the source and target samples. By applying these sparse coding-based weights to all samples during re-training, we can not only exclude outliers in the source samples but also tackle the drift problem in the target samples, and thus obtain a good scene-specific pedestrian detector. Experiments on two public datasets show that the trained scene-specific detector performs well and is comparable to a detector trained on a large number of manually labeled samples from the target scene.
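The weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weight formula, so the mapping from sparse reconstruction error to a weight (`exp(-residual^2 / sigma)`), the ISTA solver, and the names `sparse_code`, `relevancy_weights`, `lam`, and `sigma` are all assumptions chosen for illustration. The sketch encodes each sample over a dictionary of target templates and gives a higher weight to samples that the templates reconstruct well.

```python
import numpy as np

def sparse_code(x, D, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA."""
    # Step size is 1/L, where L is the Lipschitz constant of the gradient.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        z = a - grad / L
        # Soft-thresholding enforces sparsity of the code.
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return a

def relevancy_weights(samples, templates, lam=0.1, sigma=1.0):
    """Hypothetical weighting: exp(-residual^2 / sigma), a value in (0, 1]."""
    # Dictionary columns are the L2-normalised target templates.
    D = templates.T / np.linalg.norm(templates, axis=1)
    weights = []
    for x in samples:
        x = x / np.linalg.norm(x)
        a = sparse_code(x, D, lam)
        residual = np.linalg.norm(x - D @ a)
        weights.append(np.exp(-residual ** 2 / sigma))
    return np.array(weights)

# Toy data standing in for feature vectors (assumed, not from the paper).
rng = np.random.default_rng(0)
templates = rng.normal(size=(5, 50))               # 5 target templates
near = templates[0] + 0.01 * rng.normal(size=50)   # resembles a template
far = rng.normal(size=50)                          # unrelated sample
w = relevancy_weights(np.stack([near, far]), templates)
print(w)  # the first weight should be larger than the second
```

Weights of this kind could then be passed as per-sample weights to any classifier that accepts them during re-training, so that samples poorly explained by the target templates contribute less.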
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (61173054) and the National Key Technology Research and Development Program of China (2012BAH39B02). Preliminary results of this work were published in the proceedings of the 19th International Conference on Multimedia Modeling, Part II, LNCS 7733, pp. 272–282, 2013.
Cite this article
Liang, F., Tang, S., Zhang, Y. et al. Pedestrian detection based on sparse coding and transfer learning. Machine Vision and Applications 25, 1697–1709 (2014). https://doi.org/10.1007/s00138-013-0549-2
DOI: https://doi.org/10.1007/s00138-013-0549-2