ABSTRACT
Deep neural networks, and fully convolutional networks in particular, have brought great progress to salient object detection. In this paper, we propose a new deep fully convolutional architecture, the top-down feature aggregation block fusion network, which fuses the rich features of the feature aggregation blocks at each layer. Each feature aggregation block combines the features of its own layer with features from other layers, so that every block carries both the strong semantic information of the deep layers and the detailed features of the shallow layers. During top-down fusion, each layer learns residual information, as in ResNet. In addition, a non-local attention mechanism is introduced to strengthen contextual relevance, and multiple auxiliary supervision connections are added to the intermediate layers so that the network is easier to optimize and converges faster. Experiments on six benchmark datasets show that our model outperforms state-of-the-art methods both quantitatively and qualitatively.
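The abstract does not give the layer-level design, so the following is only a minimal NumPy sketch of the two ideas it names: a non-local attention step and a top-down pass in which each shallower block adds its features to the upsampled deeper result. Everything specific here is an assumption made for illustration: the function names (`non_local`, `upsample2x`, `top_down_fuse`), the absence of learned embeddings in the attention step, applying attention only to the deepest block, nearest-neighbour upsampling, and fusion by element-wise addition.

```python
import numpy as np


def softmax(a, axis=-1):
    """Numerically stable softmax along the given axis."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def non_local(x):
    """Simplified non-local operation on a (C, H, W) feature map.

    Assumption: no learned embeddings; pairwise affinities are plain
    dot products between the feature vectors of all spatial positions.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w).T            # (N, C), one row per position
    attn = softmax(flat @ flat.T, axis=-1)  # (N, N), each row sums to 1
    context = attn @ flat                   # (N, C), weighted sum over all positions
    return x + context.T.reshape(c, h, w)   # residual connection


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def top_down_fuse(blocks):
    """Fuse aggregation-block features from the deepest layer to the shallowest.

    blocks: list of (C, H, W) arrays ordered shallow -> deep, each at half
    the spatial resolution of the previous one (an assumption of this sketch).
    Returns the fused maps ordered deep -> shallow.
    """
    fused = non_local(blocks[-1])           # attention on the deepest block only
    outputs = [fused]
    for feat in reversed(blocks[:-1]):
        # Residual-style step: the shallower block adds detail to the
        # upsampled coarse result instead of replacing it.
        fused = feat + upsample2x(fused)
        outputs.append(fused)
    return outputs


rng = np.random.default_rng(0)
blocks = [rng.standard_normal((4, 8, 8)),
          rng.standard_normal((4, 4, 4)),
          rng.standard_normal((4, 2, 2))]
outs = top_down_fuse(blocks)
print([o.shape for o in outs])
```

In a trained network each map in `outs` could feed a small prediction head with its own loss, which is one way to realize the auxiliary supervision on intermediate layers that the abstract mentions.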
Index Terms
- Top-down Feature Aggregation Block Fusion Network for Salient Object Detection