Spatiotemporal consistent selection-correction network for deep interactive image segmentation

Li, Yang; Wang, Tao; Ji, Zexuan; Fu, Peng; Shen, Xiaobo; Sun, Quansen

doi:10.1007/s00521-023-08210-y

Original Article
Published: 27 January 2023

Volume 35, pages 9725–9738, (2023)
Neural Computing and Applications

Yang Li ORCID: orcid.org/0000-0002-2034-5837¹,
Tao Wang ORCID: orcid.org/0000-0002-8585-5635¹,²,
Zexuan Ji¹,
Peng Fu¹,
Xiaobo Shen¹ &
…
Quansen Sun¹

Abstract

Interactive image segmentation extracts specific targets that match the user’s intention and has received widespread attention in computer vision. Conventional interactive methods rely heavily on user interaction because of the limitations of hand-crafted low-level features. Recently, deep interactive approaches have significantly improved segmentation performance thanks to their semantic perception ability. However, these approaches generally treat each interaction independently and in the same way, regardless of the intention behind each click and the potential relationships among successive interactions. As a result, they remain restricted by the conflict between segmentation quality and the number of interactions. To overcome this problem, this paper focuses on the click-based interactive segmentation task, explicitly mining the intention of each click and linking the relationships among all clicks. A selection-correction training framework is first established to impose the global object selection and local error correction roles throughout the interaction process. Then a temporal network architecture is designed to continuously connect the entire click sequence. In this way, each click can play its respective role as fully as possible, and the spatially varying segmentation cues can be propagated through the time series. Experiments on the challenging SBD, GrabCut, DAVIS and Berkeley segmentation datasets demonstrate the effectiveness of the proposed method.
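
The two click roles described in the abstract can be illustrated with a toy sketch. This is not the authors' network: it uses no learned model. It only assumes that the first click acts as a coarse global selection and that later clicks act as small local corrections, with the mask carried from one click to the next as a stand-in for a temporal state. The names `Click`, `click_role`, `disk` and `segment`, and both radius parameters, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Click:
    row: int
    col: int
    positive: bool  # True = foreground click, False = background click


def click_role(index: int) -> str:
    # Assumption: the first click selects the object, later clicks correct errors.
    return "selection" if index == 0 else "correction"


def disk(h: int, w: int, row: int, col: int, radius: int) -> List[List[bool]]:
    # Boolean map that is True inside a disk of the given radius around the click.
    r2 = radius * radius
    return [[(r - row) ** 2 + (c - col) ** 2 <= r2 for c in range(w)]
            for r in range(h)]


def segment(h: int, w: int, clicks: List[Click],
            select_radius: int = 4, correct_radius: int = 1) -> List[List[bool]]:
    # The mask persists across clicks, so each click edits the result of the
    # previous ones instead of being handled independently.
    mask = [[False] * w for _ in range(h)]
    for i, click in enumerate(clicks):
        radius = select_radius if click_role(i) == "selection" else correct_radius
        region = disk(h, w, click.row, click.col, radius)
        for r in range(h):
            for c in range(w):
                if region[r][c]:
                    mask[r][c] = click.positive
    return mask


clicks = [Click(5, 5, True), Click(5, 8, False)]
mask = segment(12, 12, clicks)
```

In this toy run the first click marks a wide region around (5, 5) as foreground, and the second, negative click removes only a small neighbourhood around (5, 8), leaving the rest of the selection intact.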


Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.


Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 62172221 and 62072241, and in part by the Fundamental Research Funds for the Central Universities under Grant No. JSGP202204.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, Jiangsu, China
Yang Li, Tao Wang, Zexuan Ji, Peng Fu, Xiaobo Shen & Quansen Sun
Jiangsu Key Laboratory of Spectral Imaging and Intelligent Sense, Nanjing University of Science and Technology, Nanjing, 210094, Jiangsu, China
Tao Wang

Corresponding author

Correspondence to Tao Wang.

Ethics declarations

Conflict of interest

We declare that we have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Y., Wang, T., Ji, Z. et al. Spatiotemporal consistent selection-correction network for deep interactive image segmentation. Neural Comput & Applic 35, 9725–9738 (2023). https://doi.org/10.1007/s00521-023-08210-y

Received: 04 November 2021
Accepted: 06 January 2023
Published: 27 January 2023
Issue Date: May 2023
DOI: https://doi.org/10.1007/s00521-023-08210-y
