Dual-path Processing Network for High-resolution Salient Object Detection

Applied Intelligence

Abstract

Driven by deep learning, salient object detection (SOD) has achieved strong performance in recent years. Nevertheless, little work has addressed the tension between producing accurate saliency maps and the computational cost of high-resolution input images (e.g., 1024×2048 pixels or more). To meet this challenge, we introduce a dual-path processing network (DPPNet) that detects and segments salient objects in high-resolution input images directly and efficiently. The network contains a global context path and a spatial details path. Specifically, the global context path utilizes a multi feature extraction and enhanced (MFEE) module to extract richer global multiscale semantic features with a large receptive field at a lower resolution. The spatial details path employs a boundary information guided (BIG) module to focus on accurately locating salient objects and to maintain local boundary information at a higher resolution. Guided by the BIG module, a feature fusion unit (FFU) is further employed to heighten the spatial consistency of maps at different levels and boost the robustness of the network. Extensive evaluations on two high-resolution SOD datasets and four mainstream low-resolution SOD datasets indicate that the proposed method handles high-resolution image input effectively and outperforms ten state-of-the-art comparison algorithms.
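The dual-path idea in the abstract can be illustrated with a minimal sketch. This is a toy stand-in under stated assumptions, not the MFEE, BIG, or FFU modules of DPPNet: the function names, the mean-deviation "context" measure, the forward-difference "boundary" measure, and the max-based fusion rule are all hypothetical choices made only to show how a low-resolution context path and a full-resolution detail path might be combined.

```python
# Toy illustration of a dual-path layout; NOT the authors' DPPNet modules.
# All function names and the fusion rule are hypothetical stand-ins.
# Assumes the image height and width are divisible by the factor s.

def avg_pool(img, s):
    """Downsample a 2D list by averaging non-overlapping s x s blocks."""
    h, w = len(img) // s, len(img[0]) // s
    return [[sum(img[i * s + di][j * s + dj]
                 for di in range(s) for dj in range(s)) / (s * s)
             for j in range(w)] for i in range(h)]

def upsample_nearest(img, s):
    """Upsample a 2D list by repeating each value over an s x s block."""
    return [[img[i // s][j // s] for j in range(len(img[0]) * s)]
            for i in range(len(img) * s)]

def context_path(img, s):
    """Low-resolution path: coarse saliency as deviation from the global mean."""
    small = avg_pool(img, s)
    mean = sum(map(sum, small)) / (len(small) * len(small[0]))
    return [[abs(v - mean) for v in row] for row in small]

def detail_path(img):
    """Full-resolution path: boundary strength from forward differences."""
    h, w = len(img), len(img[0])
    return [[abs(img[i][min(j + 1, w - 1)] - img[i][j]) +
             abs(img[min(i + 1, h - 1)][j] - img[i][j])
             for j in range(w)] for i in range(h)]

def fuse(coarse, edges, s):
    """Combine upsampled context with full-resolution boundary cues (element-wise max)."""
    up = upsample_nearest(coarse, s)
    return [[max(c, e) for c, e in zip(c_row, e_row)]
            for c_row, e_row in zip(up, edges)]

def dual_path(img, s=2):
    """Run both paths and fuse them at the input resolution."""
    return fuse(context_path(img, s), detail_path(img), s)

# 4x4 image with a bright 2x2 object in the top-left corner.
image = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
saliency = dual_path(image, s=2)
```

The motivation for the split is arithmetic: downsampling by a factor s in each dimension reduces the pixel count by s², so the expensive context computation runs on far fewer positions while only a lightweight operation touches the full-resolution grid.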

Notes

  1. Our source code will be available at https://github.com/YQP-CV/DPPNet

Acknowledgements

This research is supported by the National Natural Science Foundation of China (No. 62002100), the National Natural Science Foundation of China (No. 61802111) and the Science and Technology Foundation of Henan Province of China (No. 212102210156).

Author information

Corresponding author

Correspondence to Wanjun Zhang.

Ethics declarations

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, J., Yang, Q., Yang, S. et al. Dual-path Processing Network for High-resolution Salient Object Detection. Appl Intell 52, 12034–12048 (2022). https://doi.org/10.1007/s10489-021-02971-6

  • DOI: https://doi.org/10.1007/s10489-021-02971-6
