A Multi-View features hinged siamese U-Net for image Co-segmentation

Li, Yushuo; Liu, Xiabi; Gong, Xiaopeng; Wang, Murong

doi:10.1007/s11042-020-08794-w

Published: 19 March 2020

Multimedia Tools and Applications, volume 80, pages 22965–22985 (2021)

Abstract

This paper proposes a new U-shaped structure that extracts multi-view features from different images and is incorporated into a network that can be trained end-to-end for the image co-segmentation task. The multi-view features integrate global correlations between images, so the common objects in different images can be segmented directly from these features. To obtain the multi-view features, we first extract deep features of the input images through two weight-sharing streams, and then compute pixel-level similarity maps from the deep features through a similarity layer. The whole architecture is a Siamese U-Net hinged by multi-view features, called iMFNet for short. We further introduce the Dice loss and employ both positive and negative examples to train the whole network. In addition, a learnable conditional random field (CRF) layer is added to iMFNet for more accurate results. Using training data from the MSRC and PASCAL VOC 2012 datasets, iMFNet achieves state-of-the-art performance on the Internet dataset and competitive performance on the iCoseg dataset.
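
As a rough illustration of two ingredients named in the abstract, the sketch below computes a pixel-level similarity map between two feature maps and a soft Dice loss. It is not the authors' implementation: the abstract does not define the similarity layer or the exact Dice formulation, so cosine similarity, the plain-sum Dice denominator, the function names `similarity_map` and `dice_loss`, and the `eps` smoothing terms are all assumptions made for illustration.

```python
import numpy as np


def similarity_map(feat_a, feat_b, eps=1e-8):
    """Cosine similarity between every pixel of feat_a and every pixel of feat_b.

    Assumption: cosine similarity stands in for the paper's similarity layer.
    feat_a, feat_b: arrays of shape (C, H, W) from the two weight-sharing streams.
    Returns shape (H*W, H, W): channel k holds the similarity of pixel k of
    feat_a to each spatial location of feat_b.
    """
    c, h, w = feat_a.shape
    a = feat_a.reshape(c, -1)
    b = feat_b.reshape(c, -1)
    # L2-normalise each pixel's feature vector so the dot product is a cosine.
    a = a / (np.linalg.norm(a, axis=0, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=0, keepdims=True) + eps)
    return (a.T @ b).reshape(h * w, h, w)


def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss, 1 - 2*sum(p*g) / (sum(p) + sum(g)).

    Assumption: one common soft Dice form; the paper may use another variant.
    pred holds foreground probabilities in [0, 1], target is a binary mask.
    eps keeps the ratio defined when both masks are empty (a negative pair).
    """
    pred = pred.reshape(-1)
    target = target.reshape(-1)
    intersection = np.sum(pred * target)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)
```

With this form, a negative pair (two images with no common object, hence an all-zero target mask) gives zero loss only when the predicted mask is also empty, which is one way negative examples can contribute to training.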



Acknowledgements

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the NVIDIA TITAN XP GPU used for this research.

Author information

Authors and Affiliations

School of Computer Science and Technology, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian District, Beijing, China
Yushuo Li, Xiabi Liu & Xiaopeng Gong
Sun Yat-sen University Cancer Center, 651 Dongfeng Road East, Yuexiu District, Guangzhou, China
Murong Wang

Corresponding author

Correspondence to Yushuo Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, Y., Liu, X., Gong, X. et al. A Multi-View features hinged siamese U-Net for image Co-segmentation. Multimed Tools Appl 80, 22965–22985 (2021). https://doi.org/10.1007/s11042-020-08794-w

Received: 12 July 2019
Revised: 19 January 2020
Accepted: 24 February 2020
Published: 19 March 2020
Issue Date: June 2021
DOI: https://doi.org/10.1007/s11042-020-08794-w
