ABSTRACT
Most deep-learning approaches to salient object detection (SOD) aggregate multi-level features to enhance performance. However, because inter-pixel information is insufficiently exploited, this aggregation often degrades the prediction of salient objects and yields blurry object boundaries. To tackle this problem, we propose a salient object detection network based on context-aware boundary perception. The network uses a context awareness (CA) branch to extract comprehensive contextual semantic information, guiding it to attend not only to salient objects themselves but also to the mutual relationships among multiple salient objects. In addition, a boundary awareness (BA) branch explores detailed boundary information around the contours of salient objects, strengthening the network's understanding of their edge pixels. Moreover, we introduce a new feature interaction aggregation (FIA) module that merges contextual semantic information and boundary detail information in the decoding stage, exploiting multi-level features effectively to generate clearer and more accurate saliency maps. Comprehensive experiments on three public datasets demonstrate that the proposed method outperforms representative state-of-the-art methods.
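The abstract does not specify how the FIA module combines the two branches; a minimal sketch, assuming a common mutual-gating scheme (each branch modulates the other via a sigmoid gate before summation), illustrates the idea. The function name `fia_fuse` and the gating design are hypothetical, not the authors' actual implementation.

```python
import numpy as np

def fia_fuse(context_feat: np.ndarray, boundary_feat: np.ndarray) -> np.ndarray:
    """Hypothetical feature-interaction aggregation.

    context_feat, boundary_feat: (C, H, W) feature maps from the CA and
    BA branches at the same decoder level. Each branch gates the other
    element-wise, so contextual semantics guide boundary details and
    vice versa, before the gated maps are summed into one fused map.
    """
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    ctx_gated = context_feat * sigmoid(boundary_feat)   # context guided by boundary
    bnd_gated = boundary_feat * sigmoid(context_feat)   # boundary guided by context
    return ctx_gated + bnd_gated                        # fused feature, same shape

rng = np.random.default_rng(0)
ctx = rng.standard_normal((64, 14, 14))
bnd = rng.standard_normal((64, 14, 14))
fused = fia_fuse(ctx, bnd)
print(fused.shape)  # (64, 14, 14)
```

In a real network the fusion would operate on learned convolutional features and be followed by further convolutions; the sketch only shows the interaction pattern.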
Index Terms: Contextual Boundary Aware Network for Salient Object Detection