Saliency Prediction with Relation-Aware Global Attention Module

Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1405)

Abstract

Deep learning methods have achieved great success in the saliency prediction task. Like depth and width, the attention mechanism has proved effective at enhancing the performance of convolutional neural networks (CNNs) in many studies. In this paper, we propose a new architecture that combines an encoder-decoder structure, multi-level integration, and a relation-aware global attention module. The encoder-decoder architecture is the main structure used to extract deeper features. The multi-level integration constructs an asymmetric path that avoids information loss. The relation-aware global attention module enhances the network both channel-wise and spatial-wise. The architecture is trained and tested on the SALICON 2017 benchmark and obtains competitive results compared with related research.
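The attention component named in the abstract follows the relation-aware global attention (RGA) design of Zhang et al. (CVPR 2020): each position's pairwise relations with all other positions are stacked alongside an embedding of the local feature, and a small convolutional head turns that stack into a per-position attention weight. The sketch below is an illustration only, not the authors' implementation; the module name `SpatialRGA`, the reduction ratio, and the single-channel embeddings are assumptions made for this minimal PyTorch rendering of the spatial variant.

```python
import torch
import torch.nn as nn


class SpatialRGA(nn.Module):
    """Minimal sketch of spatial relation-aware global attention (RGA-S).

    For every spatial position, pairwise affinities with all other
    positions (in both directions) are stacked with an embedding of the
    local feature; a 1x1-conv head maps the stack to an attention weight.
    """

    def __init__(self, in_channels, height, width, reduction=8):
        super().__init__()
        n = height * width                      # number of spatial positions
        c = max(in_channels // reduction, 1)    # reduced embedding width
        # Embeddings used to compute pairwise relations.
        self.theta = nn.Conv2d(in_channels, c, kernel_size=1)
        self.phi = nn.Conv2d(in_channels, c, kernel_size=1)
        # Embed the original feature to one channel per position.
        self.embed_x = nn.Conv2d(in_channels, 1, kernel_size=1)
        # Reduce the stacked relations (2n channels: both directions).
        self.embed_r = nn.Conv2d(2 * n, 1, kernel_size=1)
        # Fuse local embedding + relation embedding into attention logits.
        self.fuse = nn.Conv2d(2, 1, kernel_size=1)

    def forward(self, x):
        b, _, h, w = x.shape
        n = h * w
        t = self.theta(x).flatten(2)                        # (b, c, n)
        p = self.phi(x).flatten(2)                          # (b, c, n)
        rel = torch.bmm(t.transpose(1, 2), p)               # (b, n, n) affinities
        # Stack r_ij and r_ji so each position sees both relation directions.
        rel = torch.cat([rel, rel.transpose(1, 2)], dim=2)  # (b, n, 2n)
        rel = rel.transpose(1, 2).reshape(b, 2 * n, h, w)   # relations as a map
        y = torch.cat([self.embed_x(x), self.embed_r(rel)], dim=1)  # (b, 2, h, w)
        a = torch.sigmoid(self.fuse(y))                     # (b, 1, h, w) attention
        return x * a


# Usage: reweight a 64-channel feature map at 32x32 resolution.
rga = SpatialRGA(in_channels=64, height=32, width=32)
out = rga(torch.randn(2, 64, 32, 32))  # same shape as the input
```

The paper's channel-wise branch would apply the same relation-stacking idea across channels instead of positions; only the spatial variant is sketched here.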



Acknowledgement

This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1A2C2008972).

Author information


Corresponding author

Correspondence to Kang-Hyun Jo.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Cao, G., Jo, KH. (2021). Saliency Prediction with Relation-Aware Global Attention Module. In: Jeong, H., Sumi, K. (eds) Frontiers of Computer Vision. IW-FCV 2021. Communications in Computer and Information Science, vol 1405. Springer, Cham. https://doi.org/10.1007/978-3-030-81638-4_25

  • DOI: https://doi.org/10.1007/978-3-030-81638-4_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-81637-7

  • Online ISBN: 978-3-030-81638-4

  • eBook Packages: Computer Science, Computer Science (R0)
