DOI: 10.1145/3240876.3240891
research-article

A novel framework for semantic segmentation with generative adversarial network

Published: 17 August 2018

ABSTRACT

Semantic segmentation plays an important role in many high-level computer vision applications. However, the performance of Convolutional Neural Network (CNN) based segmentation models is currently limited by higher-order inconsistencies, which are mainly caused by CNNs' built-in invariance to spatial transformations and by the independent prediction made for each pixel. In this paper, a novel framework consisting of a segmentation network and a Generative Adversarial Network (GAN) is proposed to tackle this problem by enforcing long-range spatial label contiguity. With the help of fully connected layers in the discriminator and adversarial training, the GAN model can evaluate a higher-order potentials loss. The motivation is that the GAN model provides an auxiliary higher-order potentials loss to the segmentation model, so that the segmentation model is able to correct higher-order inconsistencies. Extensive experiments on a public benchmark database demonstrate the effectiveness of the proposed method.
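
The abstract does not state the exact objective, so the following is only a minimal sketch of how an auxiliary adversarial term might be combined with a per-pixel loss when training the segmentation network. The function names, the weight `lam`, and the use of a single scalar discriminator output `d_real_prob` are all assumptions made for illustration, not the authors' formulation.

```python
import math

EPS = 1e-12  # guards against log(0)


def pixel_cross_entropy(probs, labels):
    """Mean per-pixel cross-entropy.

    probs[i][c] is the predicted probability of class c at pixel i,
    labels[i] is the ground-truth class index of pixel i.
    Each pixel is scored independently of all others.
    """
    total = sum(math.log(p[y] + EPS) for p, y in zip(probs, labels))
    return -total / len(labels)


def adversarial_term(d_real_prob):
    """Penalty that is small when the discriminator believes the
    predicted label map is a ground-truth map (hypothetical scalar output)."""
    return -math.log(d_real_prob + EPS)


def segmentation_loss(probs, labels, d_real_prob, lam=0.1):
    """Per-pixel loss plus a weighted auxiliary adversarial term.

    lam is an assumed trade-off weight, not a value from the paper.
    """
    return pixel_cross_entropy(probs, labels) + lam * adversarial_term(d_real_prob)


if __name__ == "__main__":
    # Two pixels, two classes, maximally uncertain prediction.
    probs = [[0.5, 0.5], [0.5, 0.5]]
    labels = [0, 1]
    print(segmentation_loss(probs, labels, d_real_prob=0.5, lam=1.0))
```

In this sketch the cross-entropy term sees each pixel in isolation, while the adversarial term depends on a judgment of the whole label map, which is where a discriminator with fully connected layers could penalize inconsistencies that span many pixels.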


    • Published in

      ACM Other conferences
      ICIMCS '18: Proceedings of the 10th International Conference on Internet Multimedia Computing and Service
      August 2018
      243 pages
      ISBN:9781450365208
      DOI:10.1145/3240876

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      ICIMCS '18 Paper Acceptance Rate: 46 of 116 submissions, 40%. Overall Acceptance Rate: 163 of 456 submissions, 36%.
