Cumulative Nets for Edge Detection

Authors:
Jingkuan Song, University of Electronic Science and Technology of China, Chengdu, China
Zhilong Zhou, University of Electronic Science and Technology of China, Chengdu, China
Lianli Gao, University of Electronic Science and Technology of China, Chengdu, China
Xing Xu, University of Electronic Science and Technology of China, Chengdu, China
Heng Tao Shen, University of Electronic Science and Technology of China, Chengdu, China

MM '18: Proceedings of the 26th ACM international conference on Multimedia, October 2018, Pages 1847–1855
https://doi.org/10.1145/3240508.3240688

Published: 15 October 2018

ABSTRACT

Much recent progress in edge detection has been made by using Convolutional Neural Networks (CNNs). Because CNNs learn hierarchical representations, it is intuitive to design side networks that exploit the richer convolutional features to improve edge detection. However, the different side networks are isolated from one another, and the final result is usually a weighted sum of side outputs of uneven quality. To tackle these issues, we propose a Cumulative Network (C-Net), which learns each side network cumulatively from the current visual features and the low-level side outputs, gradually removing detailed or sharp boundaries to enable high-resolution and accurate edge detection. The lower-level edge information is therefore cumulatively inherited while superfluous details are progressively abandoned. Recursively learning where to remove superfluous details from the current edge map under the supervision of a higher-level visual feature is challenging. Furthermore, we employ atrous convolution (AC) and atrous spatial pyramid pooling (ASPP) to robustly detect object boundaries at multiple scales and aspect ratios. The cumulative refinement of edges from high-level visual information and lower-level edge maps is carried out by our cumulative residual attention (CRA) block. Experimental results show that C-Net sets new records for edge detection on two benchmark datasets: BSDS500 (.819 ODS, .835 OIS, and .862 AP) and NYUDV2 (.762 ODS, .781 OIS, and .797 AP). C-Net has great potential to be applied to other deep-learning-based applications, e.g., image classification and segmentation.
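The abstract names atrous convolution and ASPP without showing how they work. The sketch below illustrates the general idea in one dimension using plain Python. It is not the C-Net implementation: the function names (`atrous_conv1d`, `aspp1d`, `receptive_field`), the central-difference kernel, and the averaging of branches are assumptions made for illustration only.

```python
from typing import List, Sequence


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """1-D atrous (dilated) filter with zero padding and same-length output.

    Kernel taps are spaced `rate` samples apart, so a k-tap kernel spans
    (k - 1) * rate + 1 input samples without adding any weights.
    As in most deep learning libraries, the kernel is not flipped
    (this is cross-correlation).
    """
    if rate < 1:
        raise ValueError("rate must be >= 1")
    centre = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - centre) * rate
            if 0 <= idx < len(signal):  # taps outside the signal read zero padding
                acc += w * signal[idx]
        out.append(acc)
    return out


def aspp1d(signal: Sequence[float], kernel: Sequence[float], rates: Sequence[int]) -> List[float]:
    """Toy pyramid: apply the same kernel at several rates and average the branches.

    Averaging is an illustrative simplification, not the fusion used in the paper.
    """
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


def receptive_field(kernel_size: int, rate: int) -> int:
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


if __name__ == "__main__":
    step_edge = [0, 0, 0, 0, 1, 1, 1, 1]   # a single edge between index 3 and 4
    diff_kernel = [-1, 0, 1]               # hypothetical central-difference filter
    print(atrous_conv1d(step_edge, diff_kernel, rate=1))
    print(atrous_conv1d(step_edge, diff_kernel, rate=2))
    print(aspp1d(step_edge, diff_kernel, rates=(1, 2)))
    print([receptive_field(3, r) for r in (1, 2, 4)])
```

With the same three weights, a larger rate responds over a wider neighbourhood around the step, which is why running several rates in parallel gives a multi-scale response. The negative values at the right end come from the zero padding, not from a second edge.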

Index Terms

Cumulative Nets for Edge Detection
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision


Published in
MM '18: Proceedings of the 26th ACM international conference on Multimedia
October 2018
2167 pages
ISBN:9781450356657
DOI:10.1145/3240508
General Chairs: Susanne Boll (University of Oldenburg, Germany), Kyoung Mu Lee (Seoul National University, Korea), Jiebo Luo (University of Rochester, USA), Wenwu Zhu (Tsinghua University, China)
Program Chairs: Hyeran Byun (Yonsei University, Korea), Chang Wen Chen (State Univ. of New York at Buffalo, USA), Rainer Lienhart (University of Augsburg, Germany), Tao Mei (JD AI, China)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
cnn
cumulative residual attention
edge detection
Qualifiers
- research-article
Acceptance Rates
MM '18 Paper Acceptance Rate: 209 of 757 submissions, 28%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%