
Camouflaged Object Segmentation with Omni Perception

Published in: International Journal of Computer Vision

Abstract

Camouflaged object segmentation (COS) is a highly challenging task because candidate objects share deceptively similar appearances with their noisy backgrounds. Most existing state-of-the-art methods mimic the first-positioning-then-focus mechanism of predators, yet they still fail to position camouflaged objects in cluttered scenes or to delineate their boundaries. The key reason is that these methods lack a comprehensive understanding of the scene when spotting and focusing on the objects, so they are easily distracted by the local surroundings. An ideal COS model should process local and global information simultaneously, i.e., maintain omni perception of the scene throughout the whole segmentation process. To this end, we propose to learn omni perception for the first-positioning-then-focus COS scheme. Specifically, we propose an omni perception network (OPNet) with two novel modules, i.e., the pyramid positioning module (PPM) and the dual focus module (DFM), which integrate local features and global representations to accurately position the camouflaged objects and to focus on their boundaries, respectively. Extensive experiments demonstrate that our method, which runs at 54 fps, significantly outperforms 15 cutting-edge models on 4 challenging datasets under 4 standard metrics. The code will be made publicly available.
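
To make the first-positioning-then-focus scheme concrete, the sketch below shows, in PyTorch, one way a positioning stage and a focus stage could be wired together. This is a minimal illustrative sketch, not the authors' OPNet: the class names (PyramidPositioning, DualFocus), channel sizes, and the particular fusion of dilated local branches with a globally pooled descriptor are all assumptions, chosen only to mirror the roles the abstract assigns to the PPM and DFM.

# Hypothetical sketch of a first-positioning-then-focus pipeline.
# All module names and shapes are illustrative assumptions, not the
# published OPNet implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPositioning(nn.Module):
    """Fuses multi-scale local context with a global descriptor to
    coarsely position the camouflaged object (role of the PPM)."""
    def __init__(self, channels: int):
        super().__init__()
        # Local context at several pyramid (dilation) scales.
        self.scales = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in (1, 2, 4)
        ])
        # Global representation from global average pooling.
        self.global_fc = nn.Conv2d(channels, channels, 1)
        self.fuse = nn.Conv2d(channels * 4, channels, 1)
        self.predict = nn.Conv2d(channels, 1, 1)

    def forward(self, x):
        locals_ = [conv(x) for conv in self.scales]
        g = self.global_fc(F.adaptive_avg_pool2d(x, 1)).expand_as(x)
        feat = self.fuse(torch.cat(locals_ + [g], dim=1))
        return feat, torch.sigmoid(self.predict(feat))  # coarse location map

class DualFocus(nn.Module):
    """Refines the coarse map by attending to both the object region and
    its complement, sharpening the boundary (role of the DFM)."""
    def __init__(self, channels: int):
        super().__init__()
        self.refine = nn.Conv2d(channels * 2, channels, 3, padding=1)
        self.predict = nn.Conv2d(channels, 1, 1)

    def forward(self, feat, coarse):
        fg = feat * coarse          # focus on the positioned object
        bg = feat * (1.0 - coarse)  # focus on the distracting surroundings
        feat = self.refine(torch.cat([fg, bg], dim=1))
        return torch.sigmoid(self.predict(feat))  # boundary-aware mask

# Example usage with a dummy 64-channel backbone feature map:
#   feat = torch.randn(1, 64, 44, 44)
#   ppm, dfm = PyramidPositioning(64), DualFocus(64)
#   f, coarse = ppm(feat)
#   mask = dfm(f, coarse)

The key design point this sketch tries to capture is that both stages consume local and global evidence jointly, rather than positioning from local features alone and only then widening the view.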




Acknowledgements

This work was supported in part by the National Key Research and Development Program of China (2022ZD0210500), the National Natural Science Foundation of China under Grants 61972067/U21A20491, and the Distinguished Young Scholars Funding of Dalian (No. 2022RJ01).

Author information


Corresponding author

Correspondence to Xin Yang.

Additional information

Communicated by Karteek Alahari, Ph.D.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Mei, H., Xu, K., Zhou, Y. et al. Camouflaged Object Segmentation with Omni Perception. Int J Comput Vis 131, 3019–3034 (2023). https://doi.org/10.1007/s11263-023-01838-2

