Object affordance detection with boundary-preserving network for robotic manipulation tasks

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Object affordance detection aims to identify, locate and segment the functional regions of objects so that robots can understand and manipulate objects as humans do. The task poses two main challenges: (1) because the robot needs accurate positioning information to manipulate objects, the affordance segmentation results must have high boundary quality; (2) objects of different kinds can differ significantly in appearance yet share the same affordance, while, conversely, parts with similar appearances can have different affordances. Existing methods treat affordance detection as an image segmentation problem without attending to the boundary quality of the detection results, and most do not consider the potential relationship between object categories and object affordances. To address these problems, we propose a boundary-preserving network (BPN) that provides affordance masks with better boundary quality for robots to manipulate objects. Our framework contains three new components: an IoU (Intersection-over-Union) branch, an affordance boundary branch and a relationship attention module. The IoU branch predicts the IoU score of each object bounding box. The affordance boundary branch guides the network to learn the boundary features of objects. The relationship attention module enhances the feature representation capability of the network by exploiting the potential relationship between object categories and object affordances. Experiments show that our method improves the boundary quality of the predicted affordance masks: on the IIT-AFF dataset, the proposed BPN outperforms a strong baseline by 2.32% in F-score on affordance masks and by 2.89% in F-score on the boundaries of affordance masks. Furthermore, real-world robot manipulation experiments show that the proposed BPN provides accurate affordance information for robots to manipulate objects.
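To make the three components concrete, the sketch below shows one plausible way such heads could be wired in a PyTorch-style detection framework. It is an illustrative reconstruction only: the module names, feature dimensions and the exact form of the attention are our assumptions, not the authors' implementation.

```python
# Illustrative sketch of the three BPN components described in the abstract.
# NOT the authors' code: all names, dimensions and wiring are assumptions.
import torch
import torch.nn as nn


class IoUBranch(nn.Module):
    """Predicts the IoU between each predicted box and its ground truth,
    scoring localization quality separately from classification."""

    def __init__(self, in_dim: int = 1024):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, 1),
            nn.Sigmoid(),  # IoU score in [0, 1]
        )

    def forward(self, roi_feats: torch.Tensor) -> torch.Tensor:
        # roi_feats: (N, in_dim) pooled RoI features
        return self.mlp(roi_feats).squeeze(-1)  # (N,) predicted IoU scores


class AffordanceBoundaryBranch(nn.Module):
    """Predicts per-pixel affordance boundaries; supervising this head pushes
    the shared features to preserve sharp object boundaries."""

    def __init__(self, in_ch: int = 256):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, 1, kernel_size=1),  # boundary logits
        )

    def forward(self, mask_feats: torch.Tensor) -> torch.Tensor:
        # mask_feats: (N, in_ch, H, W) -> (N, 1, H, W) boundary logits
        return self.head(mask_feats)


class RelationshipAttention(nn.Module):
    """Re-weights affordance features with the object-category prediction,
    modeling the category-affordance correlation the abstract describes."""

    def __init__(self, num_classes: int, in_ch: int = 256):
        super().__init__()
        self.to_weights = nn.Sequential(
            nn.Linear(num_classes, in_ch),
            nn.Sigmoid(),  # per-channel attention weights
        )

    def forward(self, aff_feats: torch.Tensor,
                cls_logits: torch.Tensor) -> torch.Tensor:
        # aff_feats: (N, in_ch, H, W); cls_logits: (N, num_classes)
        w = self.to_weights(cls_logits.softmax(dim=-1))  # (N, in_ch)
        return aff_feats * w[:, :, None, None]  # channel-wise re-weighting
```

At inference time, a predicted IoU score of this kind could, for example, be multiplied with the classification confidence to re-rank detections so that well-localized boxes are preferred; whether BPN combines its scores in exactly this way is not stated in the abstract.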


Data availability

The public IIT-AFF dataset is used in this project.

Code availability

The project is not yet complete; we will release part of the source code upon its completion.


Funding

No funding was received for conducting this study.

Author information


Contributions

CY carried out the main research work and wrote the paper. QZ provided guidance for the project.

Corresponding author

Correspondence to Qiuju Zhang.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest related to this work.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

All authors have read this manuscript and would like to have it considered exclusively for publication in Neural Computing and Applications.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yin, C., Zhang, Q. Object affordance detection with boundary-preserving network for robotic manipulation tasks. Neural Comput & Applic 34, 17963–17980 (2022). https://doi.org/10.1007/s00521-022-07446-4

