Fast contour detection with supervised attention learning

  • Original Research Paper
  • Journal of Real-Time Image Processing

Abstract

Recent advances in deep convolutional neural networks have led to significant success in many computer vision tasks, including edge detection. However, existing edge detectors neglect the structural relationships among pixels, especially those among contour pixels. Inspired by human perception, this work highlights the importance of learning structural relationships and proposes a novel real-time attention edge detection (AED) framework. First, a carefully designed attention mask is employed to capture the structural relationships among pixels at edges. Second, in the decoding phase of our encoder–decoder model, a new module called dense upsampling group convolution is introduced to counter the information loss caused by strided downsampling, so that detailed structural information can be recovered even when it has been degraded in the encoding phase. The proposed relationship learning module introduces negligible computational overhead; as a result, AED meets the requirements of real-time execution with only 0.65M parameters. With the proposed model, an optimal dataset scale (ODS) F-score of 79.5 is obtained on the BSDS500 dataset at an inference speed of 105 frames per second, significantly faster than existing methods of comparable accuracy. In addition, state-of-the-art performance is achieved on the BSDS500 (81.6) and NYU Depth (77.0) datasets when using a heavier model.
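
The abstract names two mechanisms, an attention mask over contour pixels and a dense upsampling group convolution decoder, without giving code. The PyTorch sketch below is a minimal illustration of what such components could look like; it is an assumption, not the authors' implementation, and every name and hyper-parameter in it (DenseUpGroupConv, attention_edge_loss, groups, dilate, boost) is hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenseUpGroupConv(nn.Module):
    """Hypothetical sketch: upsample a feature map by `scale` with a grouped
    3x3 convolution followed by PixelShuffle, so fine detail is predicted
    rather than interpolated. `in_ch` and `out_ch * scale**2` must both be
    divisible by `groups`."""

    def __init__(self, in_ch, out_ch, scale=2, groups=4):
        super().__init__()
        # The grouped conv predicts scale^2 sub-pixel values per output
        # channel; grouping keeps the parameter count small.
        self.conv = nn.Conv2d(in_ch, out_ch * scale * scale,
                              kernel_size=3, padding=1, groups=groups)
        # PixelShuffle rearranges (N, C*s^2, H, W) into (N, C, s*H, s*W).
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.conv(x))


def attention_edge_loss(pred, target, dilate=5, boost=4.0):
    """Hypothetical sketch: binary cross-entropy weighted by a mask that
    emphasizes pixels on and near ground-truth contours. `pred` must be
    sigmoid probabilities and `target` a float edge map, both (N, 1, H, W)."""
    # Max-pooling the edge map with stride 1 dilates it, producing a band
    # of width `dilate` around every contour.
    near_edge = F.max_pool2d(target, kernel_size=dilate,
                             stride=1, padding=dilate // 2)
    # Background pixels keep weight 1; the contour band is boosted.
    mask = 1.0 + boost * near_edge
    return F.binary_cross_entropy(pred, target, weight=mask)
```

Under these assumptions, DenseUpGroupConv(64, 16, scale=2, groups=4) would double the spatial resolution of a 64-channel feature map at a fraction of the parameter cost of a dense transposed convolution, which is consistent with the abstract's emphasis on a 0.65M-parameter real-time model.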

Author information

Corresponding author

Correspondence to Mingyu You.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, R., You, M. Fast contour detection with supervised attention learning. J Real-Time Image Proc 18, 647–657 (2021). https://doi.org/10.1007/s11554-020-00980-1
