RGRN: Relation-aware graph reasoning network for object detection

  • Original Article
Neural Computing and Applications

Abstract

In object detection, the complexity of realistic scenes means that objects are frequently occluded and semantically confusable. Existing CNN-based object detectors focus only on the information within each region proposal and ignore the auxiliary role of object-object relationships, which makes it difficult to distinguish hard samples in complex scenes. Accordingly, we propose a novel relation-aware graph reasoning network (RGRN) that adaptively discovers and integrates key semantic and spatial relationships in images. Specifically, to realize information interaction and relational reasoning between nodes, we design two parallel modules: the semantic relational reasoning module (SRRM) and the spatial relational reasoning module (SPRM). SRRM mines the semantic relationships between objects by measuring the semantic similarity between graph nodes, and SPRM finds the spatial relationships between objects from the relative positions of the nodes. Our method thus considers both the relative spatial location and the semantic correlation between objects, and it can be easily embedded in existing networks in real time to improve performance. Experiments verify the effectiveness of our method, which achieves around 16% improvement on MS COCO and 10% on PASCAL VOC in terms of mAP and outperforms state-of-the-art relation-based methods.
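
The two-branch idea described in the abstract can be illustrated with a small sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: cosine similarity stands in for the semantic similarity used by SRRM, box-centre distance stands in for the relative positions used by SPRM, edge weights are normalised with a softmax, and the two relation-aware views are concatenated with the original node features. All function names are hypothetical and are not taken from the paper.

```python
import math

def softmax(row):
    """Turn raw scores into non-negative weights that sum to 1."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def semantic_adjacency(features):
    """Assumed SRRM-like edges: higher weight for more similar node features."""
    return [softmax([cosine_similarity(fi, fj) for fj in features])
            for fi in features]

def spatial_adjacency(boxes):
    """Assumed SPRM-like edges: higher weight for nearer box centres.

    Boxes are (x1, y1, x2, y2) in normalised image coordinates.
    """
    centres = [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in boxes]
    return [softmax([-math.hypot(cx - ox, cy - oy) for ox, oy in centres])
            for cx, cy in centres]

def propagate(adjacency, features):
    """One message-passing step: each node becomes a weighted mean of all nodes."""
    dim = len(features[0])
    return [[sum(w * f[d] for w, f in zip(row, features)) for d in range(dim)]
            for row in adjacency]

def relation_enhance(features, boxes):
    """Concatenate original features with the semantic and spatial views."""
    sem = propagate(semantic_adjacency(features), features)
    spa = propagate(spatial_adjacency(boxes), features)
    return [f + s + p for f, s, p in zip(features, sem, spa)]

# Toy example: three region proposals with 2-D features (made-up values).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
boxes = [(0.1, 0.1, 0.3, 0.3), (0.2, 0.1, 0.4, 0.3), (0.7, 0.7, 0.9, 0.9)]
enhanced = relation_enhance(features, boxes)
```

In this toy setup the first two proposals have similar features and nearby boxes, so each receives a larger share of the other's information than of the third proposal's. A real detector would apply learned projections before and after the propagation step; those are omitted here to keep the sketch short.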

Data availability statement

The data that support the findings of this study are available from Chu J upon reasonable request.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62162045 and 61866028) and the Technology Innovation Guidance Program Project of Jiangxi Province (Special Project of Technology Cooperation) (Grant No. 20212BDH81003).

Author information

Corresponding author

Correspondence to Jun Chu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, J., Chu, J., Leng, L. et al. RGRN: Relation-aware graph reasoning network for object detection. Neural Comput & Applic 35, 16671–16688 (2023). https://doi.org/10.1007/s00521-023-08550-9

  • DOI: https://doi.org/10.1007/s00521-023-08550-9
