A New Semantic Edge Aware Network for Object Affordance Detection

  • Regular paper
  • Published:
Journal of Intelligent & Robotic Systems

Abstract

This paper presents a new object affordance detection framework for robotic applications. Robotic manipulation tasks require accurate edge information about objects. To improve the edge quality of affordance detection results, we therefore introduce semantic edge detection into the framework. To take full advantage of the duality between the affordance detection task and the semantic edge detection task, the framework couples the two tasks through two key components: a spatial gradient fusion module and a shared gradient attention module. In the spatial gradient fusion module, gradient features derived from the segmentation block are fused with the features output by the edge block to suppress non-semantic edges in semantic edge detection. In the shared gradient attention module, sharing gradient attention weights enhances the edge consistency between the affordance detection results and the semantic edge detection results. Experiments show that the framework outputs affordance detection results with better edge quality. In particular, our method achieves state-of-the-art performance on the IIT-AFF benchmark in both affordance mask quality (F-score) and semantic edge quality (F-score), improving on strong baselines by 0.52% and 3.22%, respectively. Furthermore, we demonstrate the effectiveness of the framework in real robotic applications.
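The idea behind the spatial gradient fusion module can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify how the module is built, and the real module operates on learned feature maps inside a network. The sketch assumes a single-channel segmentation probability map, a central-difference gradient, and simple elementwise gating. The function names `spatial_gradient` and `fuse` are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def spatial_gradient(prob: Grid) -> Grid:
    """Central-difference gradient magnitude of a 2D map (borders replicated)."""
    h, w = len(prob), len(prob[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = (prob[y][min(x + 1, w - 1)] - prob[y][max(x - 1, 0)]) / 2.0
            gy = (prob[min(y + 1, h - 1)][x] - prob[max(y - 1, 0)][x]) / 2.0
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def fuse(edge_feat: Grid, grad: Grid) -> Grid:
    """Gate an edge response by the peak-normalised segmentation gradient.

    Assumption for illustration only: fusion is an elementwise product, so
    edge responses far from any segmentation boundary are suppressed.
    """
    peak = max(max(row) for row in grad)
    if peak == 0.0:
        # No segmentation boundary anywhere: suppress every edge response.
        return [[0.0] * len(row) for row in edge_feat]
    return [[e * g / peak for e, g in zip(e_row, g_row)]
            for e_row, g_row in zip(edge_feat, grad)]


# Toy segmentation map: left half background (0), right half object (1).
seg = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
# Toy edge response that fires everywhere (e.g. texture edges).
edges = [[1.0] * 4 for _ in range(4)]

fused = fuse(edges, spatial_gradient(seg))
print(fused[0])  # [0.0, 1.0, 1.0, 0.0]
```

In this toy case the edge response survives only in the two columns that straddle the object boundary, which is the kind of suppression of non-semantic edges the abstract describes.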

Data Availability

The public IIT-AFF dataset is used in this project.

Code Availability

The project is not yet complete; we will release some of the source code once it is.

Funding

No funding was received.

Author information

Contributions

Congcong Yin is responsible for the main research work and paper writing.

Qiuju Zhang is responsible for the guidance of the project.

Wenqiang Ren is responsible for assisting in setting up the experimental platform.

Corresponding author

Correspondence to Qiuju Zhang.

Ethics declarations

Ethics Approval

Not applicable.

Consent to Participate

Not applicable.

Consent for Publication

All authors have read this manuscript and would like to have it considered exclusively for publication in Journal of Intelligent & Robotic Systems.

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yin, C., Zhang, Q. & Ren, W. A New Semantic Edge Aware Network for Object Affordance Detection. J Intell Robot Syst 104, 2 (2022). https://doi.org/10.1007/s10846-021-01525-9

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s10846-021-01525-9

Keywords
