
Multi-class Human Body Parsing with Edge-Enhancement Network

  • Conference paper
  • Neural Information Processing (ICONIP 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1142)


Abstract

Single human parsing aims to partition an image into semantically consistent regions corresponding to body parts or clothing items, and has seen remarkable progress owing to a wide range of proposed methods. From the perspective of loss design, besides the parsing loss on the final output, most existing studies exploit multiple auxiliary losses to improve parsing results; however, it is hard to balance the model by adjusting their ratios, and doing so may weaken the potential of some losses. In this work, we propose an edge enhancement module to emphasize the potential of the edge loss and boundary information. At the same time, local and global information is exploited for the complex multi-class human body parsing problem through densely connected atrous spatial pyramid pooling. This scheme results in a simple yet powerful Edge-Enhancement Network (EEN). Extensive experiments demonstrate that EEN achieves 56.55% mIoU on the LIP dataset and 62.60% mIoU on the CIHP dataset, outperforming the state of the art by 3.45% and 4.02%, respectively. The code of EEN is available at https://github.com/huangxi6/EEN.
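The abstract describes two ingredients: a joint objective that couples the parsing loss with an edge loss, and densely connected atrous spatial pyramid pooling for combining local and global context. The two sketches below are illustrative only and are not the authors' released implementation; the function and module names, the weighting factor lambda_edge, and all channel sizes and dilation rates are assumptions.

A minimal sketch of a joint parsing-plus-edge objective, assuming a PyTorch-style setup where the network outputs per-pixel part logits and a binary boundary map:

```python
import torch
import torch.nn.functional as F

def joint_loss(parsing_logits, edge_logits, parsing_gt, edge_gt, lambda_edge=1.0):
    # parsing_logits: (N, C, H, W) part-class scores; parsing_gt: (N, H, W) labels.
    # edge_logits: (N, 1, H, W) boundary scores; edge_gt: (N, 1, H, W) in {0, 1}.
    loss_parsing = F.cross_entropy(parsing_logits, parsing_gt, ignore_index=255)
    loss_edge = F.binary_cross_entropy_with_logits(edge_logits, edge_gt.float())
    # lambda_edge is an illustrative weight, not a value reported in the paper.
    return loss_parsing + lambda_edge * loss_edge
```

A sketch of a densely connected atrous spatial pyramid pooling block in the spirit of DenseASPP [26]: each dilated 3x3 branch consumes the block input concatenated with the outputs of all earlier branches, so later branches see progressively larger effective receptive fields.

```python
import torch
import torch.nn as nn

class DenselyConnectedASPP(nn.Module):
    # Hypothetical configuration; in_ch, branch_ch, and dilations are assumptions.
    def __init__(self, in_ch=512, branch_ch=128, dilations=(3, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList()
        ch = in_ch
        for d in dilations:
            self.branches.append(nn.Sequential(
                nn.Conv2d(ch, branch_ch, kernel_size=3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(branch_ch),
                nn.ReLU(inplace=True),
            ))
            ch += branch_ch  # the next branch also receives this branch's output

    def forward(self, x):
        feats = [x]
        for branch in self.branches:
            feats.append(branch(torch.cat(feats, dim=1)))
        # Concatenation of the input and all branch outputs mixes local and global context.
        return torch.cat(feats, dim=1)
```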




Acknowledgments

This work is supported by the National Natural Science Foundation of China (grants No. 61672133 and No. 61832001).

Author information

Corresponding author

Correspondence to Jie Shao.



Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Huang, X., Wu, K., Hu, G., Shao, J. (2019). Multi-class Human Body Parsing with Edge-Enhancement Network. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Communications in Computer and Information Science, vol 1142. Springer, Cham. https://doi.org/10.1007/978-3-030-36808-1_51


  • DOI: https://doi.org/10.1007/978-3-030-36808-1_51


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-36807-4

  • Online ISBN: 978-3-030-36808-1

  • eBook Packages: Computer Science, Computer Science (R0)
