Feature context learning for human parsing

Huang, Tengteng; Xu, Yongchao; Bai, Song; Wang, Yongpan; Bai, Xiang

doi:10.1007/s11432-019-9935-6

Research Paper
Published: 11 November 2019

Volume 62, article number 220101, (2019)
Science China Information Sciences

Tengteng Huang¹,
Yongchao Xu¹,
Song Bai¹,
Yongpan Wang² &
…
Xiang Bai¹

Abstract

Parsing inconsistency, which refers to scattered regions and speckles in the parsing results as well as imprecise contours, is a long-standing problem in human parsing. It arises because the pixel-wise classification loss considers each pixel independently. To address this inconsistency, we propose an end-to-end trainable, highly flexible, and generic module called the feature context module (FCM). FCM explores the correlation of adjacent pixels and aggregates the contextual information embedded in the real topology of the human body. The feature representations are thereby enhanced and become more robust in distinguishing semantically related parts. Extensive experiments with three different backbone models and four benchmark datasets suggest that FCM is an effective and efficient plug-in that consistently improves the performance of existing algorithms without sacrificing much inference speed.
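
To illustrate why a pixel-wise loss cannot, by itself, penalize scattered or speckled predictions, the following minimal Python sketch may help. It is not taken from the paper: the function names and the toy 2 x 3 prediction map are hypothetical, and a plain mean cross-entropy is assumed as the pixel-wise classification loss. Because the loss is a mean of independent per-pixel terms, applying the same spatial permutation to predictions and labels leaves it unchanged up to floating-point rounding, so the loss carries no information about which pixels are adjacent.

```python
import math
import random


def pixelwise_cross_entropy(probs, labels):
    """Mean cross-entropy over all pixels.

    probs:  H x W x C nested lists of per-class probabilities (toy input).
    labels: H x W nested lists of integer class ids.
    """
    total, count = 0.0, 0
    for prob_row, label_row in zip(probs, labels):
        for pixel_probs, label in zip(prob_row, label_row):
            # Each pixel contributes its own term; neighbours are never consulted.
            total += -math.log(pixel_probs[label])
            count += 1
    return total / count


def shuffle_pixels(probs, labels, seed=0):
    """Apply the same random spatial permutation to predictions and labels."""
    height, width = len(labels), len(labels[0])
    flat = [(probs[y][x], labels[y][x]) for y in range(height) for x in range(width)]
    random.Random(seed).shuffle(flat)
    new_probs = [[flat[y * width + x][0] for x in range(width)] for y in range(height)]
    new_labels = [[flat[y * width + x][1] for x in range(width)] for y in range(height)]
    return new_probs, new_labels


# Hypothetical two-class prediction map (2 rows, 3 columns).
probs = [
    [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]],
    [[0.6, 0.4], [0.2, 0.8], [0.1, 0.9]],
]
labels = [
    [0, 0, 1],
    [0, 1, 1],
]

loss = pixelwise_cross_entropy(probs, labels)
shuffled_probs, shuffled_labels = shuffle_pixels(probs, labels)
shuffled_loss = pixelwise_cross_entropy(shuffled_probs, shuffled_labels)

# The two values agree up to rounding: spatial layout does not affect the loss.
print(loss, shuffled_loss)
```

A module that aggregates features from adjacent pixels, as the abstract describes for FCM, reintroduces exactly the spatial dependence that this loss ignores.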

Acknowledgements

This work was supported in part by the National Key Research and Development Program of China (Grant No. 2018YFB1004600), the National Natural Science Foundation of China (Grant No. 61703171), and the Natural Science Foundation of Hubei Province of China (Grant No. 2018CFB199). This work was also supported by Alibaba Group through the Alibaba Innovative Research (AIR) Program. The work of Yongchao XU was supported by the Young Elite Scientists Sponsorship Program by CAST. The work of Xiang BAI was supported by the National Program for Support of Top-Notch Young Professionals and in part by the Program for HUST Academic Frontier Youth Team.

Author information

Authors and Affiliations

School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, 430074, China
Tengteng Huang, Yongchao Xu, Song Bai & Xiang Bai
Alibaba Group, Hangzhou, 311121, China
Yongpan Wang

Corresponding author

Correspondence to Yongchao Xu.

About this article

Cite this article

Huang, T., Xu, Y., Bai, S. et al. Feature context learning for human parsing. Sci. China Inf. Sci. 62, 220101 (2019). https://doi.org/10.1007/s11432-019-9935-6

Received: 03 June 2019
Accepted: 13 June 2019
Published: 11 November 2019
DOI: https://doi.org/10.1007/s11432-019-9935-6
