Object affordance detection with boundary-preserving network for robotic manipulation tasks

Yin, Congcong; Zhang, Qiuju

doi:10.1007/s00521-022-07446-4

Original Article
Published: 07 June 2022

Volume 34, pages 17963–17980, (2022)

Neural Computing and Applications

Abstract

Object affordance detection aims to identify, locate and segment the functional regions of objects, so that robots can understand and manipulate objects as humans do. The task poses two main challenges: (1) Because the robot needs accurate positioning information to manipulate objects, the affordance segmentation results must have high boundary quality. (2) Different kinds of objects can differ significantly in appearance yet share the same affordance. Conversely, parts with the same appearance may have different affordances. Existing methods treat affordance detection as an image segmentation problem and do not focus on the boundary quality of the detection results. In addition, most existing methods do not consider the potential relationship between object categories and object affordances. To address these problems, we propose a boundary-preserving network (BPN) that provides affordance masks with better boundary quality for robots to manipulate objects. Our framework contains three new components: the IoU (Intersection-over-Union) branch, the affordance boundary branch and the relationship attention module. The IoU branch predicts the IoU score of each object bounding box. The affordance boundary branch guides the network to learn the boundary features of objects. The relationship attention module enhances the feature representation capability of the network by exploring the potential relationship between object categories and object affordances. Experiments show that our method improves the boundary quality of the predicted affordance masks. On the IIT-AFF dataset, the proposed BPN outperforms the strong baseline by 2.32% and 2.89% in F-score on affordance masks and on the boundaries of affordance masks, respectively. Furthermore, real-world robot manipulation experiments show that the proposed BPN can provide accurate affordance information for robots to manipulate objects.
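The quantities named in the abstract (bounding-box IoU, F-score on masks, F-score on mask boundaries) can be illustrated with a minimal sketch. This is not the authors' code, which has not been released. The function names `box_iou`, `mask_f_score` and `mask_boundary` are hypothetical, the F-score is assumed to be the plain F1 measure, and the boundary is assumed to be the set of foreground pixels with at least one background 4-neighbour. The paper's actual evaluation protocol may differ, for example by using a weighted F-measure or a distance tolerance on boundaries.

```python
import numpy as np


def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mask_f_score(pred, target):
    """Plain F1 score between two binary masks (assumed metric)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.logical_and(pred, target).sum()
    if tp == 0:
        return 0.0
    precision = tp / pred.sum()
    recall = tp / target.sum()
    return float(2 * precision * recall / (precision + recall))


def mask_boundary(mask):
    """Foreground pixels that touch background in a 4-neighbourhood."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with background so pixels on the image border count as boundary.
    p = np.pad(mask, 1, constant_values=False)
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return mask & ~interior
```

Under these assumptions, a boundary score for a predicted mask `pred` against a ground-truth mask `gt` would be `mask_f_score(mask_boundary(pred), mask_boundary(gt))`, which requires exact pixel agreement and is therefore stricter than a tolerance-based boundary metric.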

Data availability

The public IIT-AFF dataset is used in this project.

Code availability

The project is still in progress; part of the source code will be released once it is complete.

Funding

No funding was received for conducting this study.

Author information

Authors and Affiliations

School of Mechanical Engineering, Jiangnan University, Wuxi, 214122, China
Congcong Yin & Qiuju Zhang
Jiangsu Province Key Laboratory of Advanced Food Manufacturing Equipment and Technology, Jiangnan University, Wuxi, 214122, China
Congcong Yin & Qiuju Zhang

Contributions

CY is responsible for the main research work and paper writing. QZ is responsible for the guidance of the project.

Corresponding author

Correspondence to Qiuju Zhang.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest related to this work.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

All authors have read this manuscript and agree to its consideration exclusively for publication in Neural Computing and Applications.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yin, C., Zhang, Q. Object affordance detection with boundary-preserving network for robotic manipulation tasks. Neural Comput & Applic 34, 17963–17980 (2022). https://doi.org/10.1007/s00521-022-07446-4

Received: 19 October 2021
Accepted: 14 May 2022
Published: 07 June 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s00521-022-07446-4
