Reinforcing Pedestrian Parsing on Small Scale Dataset

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10704)

Abstract

In this paper we address the problem of automatic pedestrian parsing in surveillance video when only a small number of training samples is available. Although human parsing has achieved great success with high-capacity models, parsing pedestrians under practical surveillance conditions remains challenging, because fitting complicated environmental interference requires more pixel-level training samples. Creating large datasets with pixel-level labels, however, is extremely costly because of the human effort required. Our method captures pedestrian information from unlabeled datasets and uses it to update the trained model through reinforcement learning, achieving good performance with far fewer pixel-level labeled samples. Both quantitative and qualitative experiments on practical surveillance datasets demonstrate the effectiveness of the proposed method.

Acknowledgments

This research was supported by the National Natural Science Foundation of China under Grants U1611461, 61231015, 61303114, 61671332 and 61671336, by the EU FP7 QUICK project under Grant Agreement No. PIRSES-GA-2013-612652, by the National High Technology Research and Development Program of China under Grant 2015AA016306, by the Technology Research Program of the Ministry of Public Security under Grant 2016JSYJA12, by the Hubei Province Technological Innovation Major Project under Grants 2016AAA015 and 2017AAA123, and by the Natural Science Foundation of Jiangsu Province under Grant BK20160386.

Author information

Corresponding author

Correspondence to Qi Zheng.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Zheng, Q., Chen, J., Jiang, J., Hu, R. (2018). Reinforcing Pedestrian Parsing on Small Scale Dataset. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10704. Springer, Cham. https://doi.org/10.1007/978-3-319-73603-7_34

  • DOI: https://doi.org/10.1007/978-3-319-73603-7_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73602-0

  • Online ISBN: 978-3-319-73603-7

  • eBook Packages: Computer Science, Computer Science (R0)
