Abstract
With the rise of deep learning, salient object detection (SOD) has achieved strong performance in recent years. Nevertheless, little work has addressed the tension between accurate saliency map segmentation and the computational cost of high-resolution input images (e.g., 1024×2048 pixels or more). To meet this challenge, we introduce a dual-path processing network (DPPNet) that detects and segments salient objects in high-resolution input images directly and efficiently. The network consists of a global context path and a spatial details path. Specifically, the global context path uses a multi feature extraction and enhanced (MFEE) module to extract richer global multiscale semantic features with a large receptive field at a lower resolution. The spatial details path employs a boundary information guided (BIG) module to localize salient objects accurately and preserve local boundary information at a higher resolution. Guided by the BIG module, a feature fusion unit (FFU) is further employed to improve the spatial consistency of maps at different levels and boost the robustness of the network. Extensive evaluations on two high-resolution SOD datasets and four mainstream low-resolution SOD datasets indicate that the proposed method handles high-resolution input effectively and outperforms ten competing algorithms.
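To make the cost argument concrete, the following is a rough sketch of how convolution cost scales with input resolution, and why running most layers on a downsampled global context path can be much cheaper than running every layer at full resolution. The layer counts, channel width, and downsampling factors are hypothetical values chosen for illustration. They are not taken from DPPNet, and the helper names `conv_macs` and `path_macs` are likewise assumptions.

```python
def conv_macs(height, width, c_in, c_out, kernel=3):
    """Multiply-accumulate count for one stride-1, same-padded convolution."""
    return height * width * c_in * c_out * kernel * kernel


def path_macs(height, width, scale, channels, layers):
    """Cost of a stack of identical conv layers run at 1/scale resolution."""
    h, w = height // scale, width // scale
    return layers * conv_macs(h, w, channels, channels)


H, W = 1024, 2048  # high-resolution input size quoted in the abstract

# Hypothetical configurations (assumptions, not the paper's settings):
# a single path running 10 layers at full resolution ...
single_path = path_macs(H, W, scale=1, channels=64, layers=10)

# ... versus a deep context path at 1/8 resolution plus a shallow
# details path at 1/2 resolution, with the same total layer count.
context_path = path_macs(H, W, scale=8, channels=64, layers=8)
detail_path = path_macs(H, W, scale=2, channels=64, layers=2)
dual_path = context_path + detail_path

print(f"single path: {single_path / 1e9:.1f} GMACs")
print(f"dual path:   {dual_path / 1e9:.1f} GMACs")
print(f"reduction:   {single_path / dual_path:.1f}x")
```

Because the cost of a convolution grows linearly with the number of pixels, halving the resolution along both axes cuts the cost of a layer by a factor of four. Under these assumed settings, most of the remaining cost comes from the shallow high-resolution path, which is why such a path is typically kept lightweight.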
Notes
Our source code will be available at https://github.com/YQP-CV/DPPNet
Acknowledgements
This research was supported by the National Natural Science Foundation of China (Nos. 62002100 and 61802111) and the Science and Technology Foundation of Henan Province of China (No. 212102210156).
Ethics declarations
Conflict of Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
Wang, J., Yang, Q., Yang, S. et al. Dual-path Processing Network for High-resolution Salient Object Detection. Appl Intell 52, 12034–12048 (2022). https://doi.org/10.1007/s10489-021-02971-6
DOI: https://doi.org/10.1007/s10489-021-02971-6