FGO-Net: Feature and Gaussian Optimization Network for visual saliency prediction

Abstract

Convolutional neural networks (CNNs) have become a major driving force for visual saliency prediction. However, because features from different layers have diverse characteristics, not all of them are effective for saliency detection, and some even introduce interference. To effectively fuse multiscale features from different layers, we propose a feature and Gaussian optimization network (FGO-Net) for saliency prediction. Specifically, we first design a novel attention ConvLSTM (ACL) module that uses circular soft layer attention to iteratively reweight multilevel features from different layers. In addition, previous works generally apply Gaussian blur with a fixed kernel size to the generated saliency map as a post-processing step. We therefore design an adaptive Gaussian blur (AGB) module that automatically selects an appropriate Gaussian kernel size to blur the saliency map, removing the need for post-processing. Extensive experiments on several public saliency datasets demonstrate that the proposed FGO-Net achieves competitive results across various evaluation metrics.
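
To make the two mechanisms concrete, the following is a minimal PyTorch sketch based only on the abstract's description, not the authors' implementation: an iterative soft layer attention that reweights multilevel features (the ACL module's ConvLSTM recurrence is omitted), and an adaptive Gaussian blur that predicts a per-image blur strength as a continuous, differentiable stand-in for selecting a kernel size. All module, function, and parameter names here are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftLayerAttention(nn.Module):
    # Sketch of the "circular soft layer attention" idea: iteratively score
    # each feature level against the current fused state and reweight the
    # levels with a softmax before fusing again.
    def __init__(self, channels, num_iters=3):
        super().__init__()
        self.num_iters = num_iters
        self.score = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, 1))

    def forward(self, feats):  # feats: list of L tensors, each (B, C, H, W)
        stacked = torch.stack(feats, dim=1)              # (B, L, C, H, W)
        fused = stacked.mean(dim=1)                      # initial fusion
        for _ in range(self.num_iters):
            scores = torch.stack([self.score(f + fused) for f in feats], dim=1)
            weights = scores.softmax(dim=1)[..., None, None]  # (B, L, 1, 1, 1)
            fused = (stacked * weights).sum(dim=1)       # reweighted fusion
        return fused

class AdaptiveGaussianBlur(nn.Module):
    # Sketch of the adaptive Gaussian blur idea: predict a per-image sigma
    # from pooled features, then apply a separable Gaussian filter of that
    # strength to the raw saliency map, replacing fixed-kernel post-processing.
    def __init__(self, channels, max_sigma=8.0, kernel_size=31):
        super().__init__()
        self.kernel_size, self.max_sigma = kernel_size, max_sigma
        self.sigma_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, 1), nn.Sigmoid())  # maps to (0, 1)

    def forward(self, feats, saliency):  # feats (B,C,H,W); saliency (B,1,H,W)
        b, half = saliency.size(0), self.kernel_size // 2
        sigma = self.sigma_head(feats) * self.max_sigma + 1e-3    # (B, 1)
        x = torch.arange(-half, half + 1, device=saliency.device,
                         dtype=saliency.dtype)
        k = torch.exp(-x[None, :] ** 2 / (2 * sigma ** 2))        # (B, K)
        k = k / k.sum(dim=1, keepdim=True)
        s = saliency.view(1, b, *saliency.shape[-2:])  # fold batch into channels
        s = F.conv2d(F.pad(s, (half, half, 0, 0), mode="replicate"),
                     k.view(b, 1, 1, -1), groups=b)    # horizontal pass
        s = F.conv2d(F.pad(s, (0, 0, half, half), mode="replicate"),
                     k.view(b, 1, -1, 1), groups=b)    # vertical pass
        return s.view_as(saliency)

Both sketches are differentiable end to end, which is what allows the blur strength to be learned by the network rather than fixed as a post-processing step.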

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61902139).

Author information

Corresponding author

Correspondence to He Tang.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The first two authors contributed equally to this work.

About this article

Cite this article

Pei, J., Zhou, T., Tang, H. et al. FGO-Net: Feature and Gaussian Optimization Network for visual saliency prediction. Appl Intell 53, 6214–6229 (2023). https://doi.org/10.1007/s10489-022-03647-5
