HANA: Hierarchical Attention Network Assembling for Semantic Segmentation

Cognitive Computation

Abstract

Semantic segmentation is a central problem in computer vision: it aims to assign each pixel in an image to a semantic object category. Modern cognitive research suggests that biological systems carry both hidden features and explicit features. Both contain useful information, but the hidden features need further processing to become explicit. Inspired by this theory, we propose a semantic segmentation framework named hierarchical attention network assembling (HANA). It exploits multiple kinds of auxiliary information at different levels, corresponding to the two kinds of features in the cognitive system, and further processes the hidden information to make it explicit for semantic segmentation. Traditional methods, by contrast, use auxiliary tasks that supply only hidden information and therefore offer limited assistance. In this study, the attention mechanism is used to introduce two auxiliary tasks as attention modules that give explicit guidance to the semantic segmentation task. Two hierarchical sub-networks, an object-level bounding box attention network and an edge-level boundary attention network, together serve as the explicit auxiliary tasks. The first, driven by object detection, aims to strengthen the consistency constraint on pixels belonging to the same object; the second, driven by boundary detection, aims to improve segmentation accuracy in boundary regions. The proposed method achieves 78.3% mean IoU on PASCAL VOC 2012, indicating that the explicit guidance of the two auxiliary tasks assists the semantic segmentation task.
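
The mean IoU figure quoted in the abstract is the per-class intersection-over-union averaged over classes. The following is a minimal sketch of that metric, not the authors' evaluation code. It assumes label maps flattened into plain sequences of integer class ids, and it assumes the label 255 marks pixels to ignore, following the PASCAL VOC convention for void regions. The function name `mean_iou` and its parameters are hypothetical. Classes that appear in neither the prediction nor the ground truth are skipped here, which is a simplification; benchmark scores are normally accumulated over the whole dataset before averaging.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Average per-class IoU over classes present in pred or target.

    pred, target: equal-length sequences of integer class ids
    (an image flattened row by row). ignore_index is an assumed
    "void" label whose pixels are excluded from the score.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # void pixel: contributes to no class
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # Wrong pixel: enlarges the union of the predicted class
            # and of the true class, but no intersection.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and one ignored pixel.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
print(round(mean_iou(pred, target, num_classes=3), 4))  # 0.7222
```

In the toy example the per-class IoUs are 1/2, 2/3 and 1/1, so the mean is about 0.7222. The last pixel is ignored because its ground-truth label is 255.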

Author information

Corresponding author

Correspondence to Wei Liu.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, W., Li, D. & Su, H. HANA: Hierarchical Attention Network Assembling for Semantic Segmentation. Cogn Comput 13, 1128–1135 (2021). https://doi.org/10.1007/s12559-021-09911-z

  • DOI: https://doi.org/10.1007/s12559-021-09911-z
