Cross channel aggregation similarity network for salient object detection

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Salient object detection is an efficient preprocessing technique for binary segmentation tasks. Existing deep learning-based works have achieved an enormous leap forward, with outstanding performance in the field of computer vision. Most previous methods rely on multi-scale fusion and attention mechanisms for efficient feature extraction, yet they neglect necessary global context characteristics and the computational limitations of general models. To mitigate the adverse effects of feature dilution during top-down transmission, we propose a cross channel aggregation similarity network (CCANet) with three modules. The cross channel aggregation module retains high-response channels from the integrated feature maps of different layers to extract efficient global context information. The similarity fusion module computes the similarity among high-level semantic, low-level spatial, and global context features to enhance the complementarity of the maps. The dense residual module extracts denser features under multi-scale receptive fields to improve the density of the prediction maps. In addition, a combined loss function with a modified weighted binary cross-entropy term is applied to alleviate the class imbalance that arises during training. Thanks to this overall design, experimental results show that CCANet achieves state-of-the-art performance on six public benchmark datasets. Without any post-processing, it runs real-time inference at around 32 FPS on a 320 \(\times\) 320 image.
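
To make the class imbalance point concrete, here is a minimal sketch of a generic class-balanced binary cross-entropy. It is not the modified loss used in CCANet, whose exact form the abstract does not give. The function name `weighted_bce` and the weighting rule (each class weighted by the other class's share of pixels) are assumptions chosen for illustration only.

```python
import math


def weighted_bce(pred, target, eps=1e-7):
    """Class-balanced binary cross-entropy over a flat list of pixels.

    pred:   predicted saliency probabilities in [0, 1]
    target: ground-truth labels, 1 = salient, 0 = background

    Assumed weighting (illustrative, not taken from the paper): each class
    is weighted by the other class's share of pixels, so the rarer class
    contributes more per pixel. If only one class is present, its weight
    is 0 and the loss degenerates to 0.
    """
    n = len(target)
    n_pos = sum(target)
    n_neg = n - n_pos
    w_pos = n_neg / n  # weight for salient pixels
    w_neg = n_pos / n  # weight for background pixels

    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if t == 1:
            total -= w_pos * math.log(p)
        else:
            total -= w_neg * math.log(1.0 - p)
    return total / n


# One salient pixel among three background pixels.
target = [1, 0, 0, 0]
missed_object = weighted_bce([0.1, 0.1, 0.1, 0.1], target)
false_alarm = weighted_bce([0.9, 0.9, 0.1, 0.1], target)
print(missed_object, false_alarm)
```

With this weighting, missing the single salient pixel is penalised more heavily than raising a false alarm on one background pixel, which is the behaviour a class-imbalance correction is meant to produce.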

Acknowledgements

This work was supported by the National Science Foundation of China under Grant nos. 61672467, 61877055, 61976195, and 61902358, and in part by the Natural Science Foundation of Zhejiang under Grant LY18F030013 and the Basic Science and Technology Research of Heilongjiang under Grant 2020-KYYWFMY-0065.

Author information

Corresponding authors

Correspondence to Zhonglong Zheng or Riheng Jia.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, L., Liu, H., Mo, J. et al. Cross channel aggregation similarity network for salient object detection. Int. J. Mach. Learn. & Cyber. 13, 2153–2169 (2022). https://doi.org/10.1007/s13042-022-01512-y

  • DOI: https://doi.org/10.1007/s13042-022-01512-y
