DOI: 10.1145/3240508.3240553
research-article

StripNet: Towards Topology Consistent Strip Structure Segmentation

Published: 15 October 2018

ABSTRACT

In this work, we study a special semantic segmentation problem in which the targets are long, continuous strip patterns. Strip patterns are common in medical images and natural photos, such as retinal layers in OCT images and lanes on roads, and segmenting them has practical significance. Traditional pixel-level segmentation methods largely ignore the structural prior of strip patterns and thus easily suffer from topological inconsistency, such as holes and isolated islands in the segmentation results. To tackle this problem, we design a novel deep framework, StripNet, that leverages the strong end-to-end learning ability of CNNs to predict structured outputs as a sequence of boundary locations of the target strips. Specifically, StripNet decomposes the original segmentation problem into easier local boundary-regression problems and takes into account the topological constraints on the predicted boundaries. Moreover, our framework adopts a coarse-to-fine strategy and uses carefully designed heatmaps for training the boundary localization network. We evaluate StripNet on two challenging strip pattern segmentation tasks, retinal layer segmentation and lane detection. Extensive experiments demonstrate that StripNet achieves excellent results and outperforms state-of-the-art methods on both tasks.
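The idea of predicting boundary locations instead of per-pixel labels can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `enforce_order` and `rasterize`, the clamping rule, and the list-of-lists layout are all assumptions made for illustration. Given one row position per boundary per image column, each boundary is clamped to lie at or below the previous one, and the regions between boundaries are then filled. In the resulting mask, labels in every column are non-decreasing from top to bottom, so holes and isolated islands cannot occur within a column.

```python
from typing import List


def enforce_order(boundaries: List[List[float]]) -> List[List[float]]:
    """Clamp boundaries so that boundary k never lies above boundary k-1.

    boundaries[k][c] is the (hypothetical) predicted row position of
    boundary k in image column c. This simple clamping is an assumed
    stand-in for a topological constraint, not the paper's method.
    """
    fixed = [list(boundaries[0])]
    for k in range(1, len(boundaries)):
        prev = fixed[-1]
        fixed.append([max(y, p) for y, p in zip(boundaries[k], prev)])
    return fixed


def rasterize(boundaries: List[List[float]], height: int) -> List[List[int]]:
    """Turn ordered boundaries into a label mask of shape height x width.

    The label of pixel (r, c) is the number of boundaries located at or
    above row r in column c, so labels grow monotonically down a column.
    """
    width = len(boundaries[0])
    mask = [[0] * width for _ in range(height)]
    for c in range(width):
        for r in range(height):
            mask[r][c] = sum(1 for b in boundaries if r >= b[c])
    return mask


# Two boundaries over three columns; the second crosses the first in column 1.
raw = [[1.0, 2.0, 1.5],
       [3.0, 1.0, 4.0]]
ordered = enforce_order(raw)
mask = rasterize(ordered, height=5)
```

Because the mask is derived from ordered boundaries, consistency holds by construction rather than being something a pixel-wise classifier has to learn.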


Published in

MM '18: Proceedings of the 26th ACM international conference on Multimedia
October 2018
2167 pages
ISBN: 9781450356657
DOI: 10.1145/3240508

Copyright © 2018 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

MM '18 Paper Acceptance Rate: 209 of 757 submissions, 28%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%
