
Object affordance detection with relationship-aware network

  • Extreme Learning Machine and Deep Learning Networks
  • Published in: Neural Computing and Applications

Abstract

Object affordance detection, which aims to understand the functional attributes of objects, is of great significance for an autonomous robot to achieve human-like object manipulation. In this paper, we propose a novel relationship-aware convolutional neural network that takes into consideration both the symbiotic relationship among multiple affordances and the combinational relationship between affordance and objectness, in order to predict the most probable affordance label for each pixel of an object. Unlike existing CNN-based methods that rely on a separate, intermediate object-detection step, our proposed network directly produces pixel-wise affordance maps from an input image in an end-to-end manner. Specifically, the network has three key components: a Coord-ASPP module, which introduces CoordConv into atrous spatial pyramid pooling (ASPP) to refine the feature maps; a relationship-aware module, which links affordances to their corresponding objects to explore these relationships; and an online sequential extreme learning machine (OS-ELM) auxiliary attention module, which further focuses on individual affordances to assist the relationship-aware module. Experimental results on two public datasets show the merits of each module and demonstrate the superiority of our relationship-aware network over the state of the art.
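To illustrate the CoordConv operation on which the Coord-ASPP module builds: CoordConv (Liu et al., 2018) concatenates normalized coordinate channels onto a feature map before convolution, giving the filters explicit access to spatial position. The following is a minimal NumPy sketch of that general technique, not the authors' implementation; the (C, H, W) layout and the [-1, 1] coordinate normalization are assumptions for illustration.

```python
import numpy as np

def add_coord_channels(feature_map):
    """Append normalized y/x coordinate channels (the CoordConv idea)
    to a feature map of shape (C, H, W), returning shape (C + 2, H, W)."""
    c, h, w = feature_map.shape
    # Row coordinates, normalized to [-1, 1], broadcast across columns.
    ys = np.linspace(-1.0, 1.0, h).reshape(h, 1).repeat(w, axis=1)
    # Column coordinates, normalized to [-1, 1], broadcast across rows.
    xs = np.linspace(-1.0, 1.0, w).reshape(1, w).repeat(h, axis=0)
    # Stack the two coordinate planes after the original channels.
    return np.concatenate([feature_map, ys[None], xs[None]], axis=0)

# Example: a hypothetical 256-channel, 32x32 feature map.
fm = np.zeros((256, 32, 32), dtype=np.float32)
out = add_coord_channels(fm)  # two extra channels carry spatial position
```

A standard convolution would then be applied to `out`; because the last two channels encode where each pixel is, the layer can learn position-dependent responses that an ordinary translation-invariant convolution cannot.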




Acknowledgements

This work was supported by National Key R&D Program of China under Grant 2017YFB130092, National Natural Science Foundation of China (NSFC) under Grants 61872327 and 61472380 as well as the Fundamental Research Funds for the Central Universities under Grant WK2380000001.

Author information


Corresponding author

Correspondence to Yang Cao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest regarding this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhao, X., Cao, Y. & Kang, Y. Object affordance detection with relationship-aware network. Neural Comput & Applic 32, 14321–14333 (2020). https://doi.org/10.1007/s00521-019-04336-0

