DOI: 10.1145/3240508.3240688
research-article

Cumulative Nets for Edge Detection

Published: 15 October 2018

ABSTRACT

Much recent progress in edge detection has come from Convolutional Neural Networks (CNNs). Because CNNs learn hierarchical representations, it is intuitive to design side networks that use the richer convolutional features to improve edge detection. However, the side networks are isolated from one another, and the final result is usually a weighted sum of side outputs of uneven quality. To tackle these issues, we propose a Cumulative Network (C-Net), which learns each side network cumulatively from the current visual features and the low-level side outputs, gradually removing detailed or sharp boundaries to enable high-resolution, accurate edge detection. In this way, lower-level edge information is cumulatively inherited while superfluous details are progressively discarded. Recursively learning where to remove superfluous details from the current edge map, under the supervision of a higher-level visual feature, is challenging. Furthermore, we employ atrous convolution (AC) and atrous spatial pyramid pooling (ASPP) to robustly detect object boundaries at multiple scales and aspect ratios. Our cumulative residual attention (CRA) block cumulatively refines edges using high-level visual information and lower-level edge maps. Experimental results show that C-Net sets new records for edge detection on two benchmark datasets: BSDS500 (.819 ODS, .835 OIS, and .862 AP) and NYUDv2 (.762 ODS, .781 OIS, and .797 AP). C-Net has great potential for other deep-learning-based applications, e.g., image classification and segmentation.
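As a rough illustration of the atrous convolution mentioned in the abstract, the sketch below applies a 1-D filter whose taps are spaced `rate` samples apart, so the same three weights cover a wider span of the input. It is not the C-Net implementation: the function names `atrous_conv1d` and `receptive_field`, the toy signal, and the central-difference kernel are all assumptions made for this example.

```python
def receptive_field(kernel_size, rate):
    """Input span covered by one output of an atrous filter (hypothetical helper)."""
    return (kernel_size - 1) * rate + 1


def atrous_conv1d(signal, kernel, rate=1):
    """Valid-mode 1-D atrous (dilated) cross-correlation in pure Python.

    Kernel taps are placed `rate` samples apart; rate=1 is ordinary
    cross-correlation. Illustrative only, not the paper's code.
    """
    span = (len(kernel) - 1) * rate  # distance between first and last tap
    out = []
    for start in range(len(signal) - span):
        acc = 0.0
        for k, w in enumerate(kernel):
            acc += w * signal[start + k * rate]
        out.append(acc)
    return out


# Toy 1-D "image row": a bright segment on a dark background.
row = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
edge_kernel = [-1, 0, 1]  # central difference, an assumed stand-in for a learned filter

fine = atrous_conv1d(row, edge_kernel, rate=1)    # 3-sample receptive field
coarse = atrous_conv1d(row, edge_kernel, rate=2)  # 5-sample receptive field
print(fine)
print(coarse)
```

With the same three weights, the `rate=2` filter responds over a wider neighbourhood around each step in `row`, which is the multi-scale behaviour that pooling over several rates is meant to exploit.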


Published in

MM '18: Proceedings of the 26th ACM International Conference on Multimedia
October 2018
2167 pages
ISBN: 9781450356657
DOI: 10.1145/3240508

Copyright © 2018 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

MM '18 paper acceptance rate: 209 of 757 submissions, 28%. Overall acceptance rate: 995 of 4,171 submissions, 24%.
