Abstract
Object detection with convolutional neural networks treats recognition purely as a feature-extraction problem and ignores the knowledge and experience that could reveal higher-level relationships between objects. This paper proposes a knowledge graph network, based on a graph convolutional network, that improves the accuracy of baseline detectors and can be integrated into any object detection framework. First, an experience memory module is created to store information about the categories in the database. When an image is input to the database, an experience vector is obtained for it. An experience data graph is then constructed by counting the co-occurrences of labels in the dataset. Finally, a graph convolutional network extracts the relationship between the experience vector and the data graph matrix, and this relational pattern helps the baseline detector perform better. Several classical object detectors are evaluated on the COCO, VOC, and KITTI datasets. The results show a significant increase in mAP for the baseline detectors when the knowledge graph network is used.
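The graph-construction and graph-convolution steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the experience memory module or the network architecture, so the code below assumes a plain label co-occurrence count followed by one generic graph convolution layer with the commonly used symmetric normalization. All function names, the toy labels, and the toy feature and weight matrices are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cooccurrence_matrix(image_labels, num_classes):
    """Count how often each pair of distinct labels appears in the same image."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for labels in image_labels:
        # Deduplicate so a label repeated within one image is counted once.
        for i, j in combinations(sorted(set(labels)), 2):
            counts[i][j] += 1
            counts[j][i] += 1
    return counts


def normalize_adjacency(counts):
    """Assumed normalization: D^-1/2 (A + I) D^-1/2, with self-loops added."""
    n = len(counts)
    a_hat = [[counts[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loop
    return [[a_hat[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weight):
    """One generic graph convolution: ReLU(adj @ features @ weight)."""
    out = matmul(matmul(adj, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy dataset: each inner list holds the class indices present in one image.
image_labels = [[0, 1], [0, 1, 2], [2], [0, 2]]
counts = cooccurrence_matrix(image_labels, num_classes=3)
adj = normalize_adjacency(counts)

# Hypothetical per-class feature vectors standing in for experience vectors.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights keep the example readable
hidden = gcn_layer(adj, features, weight)
```

In this sketch each row of `hidden` mixes a class's own features with those of the classes it co-occurs with, weighted by the normalized counts. A detector could in principle use such relation-aware class features to refine its predictions, although how the paper combines them with the baseline detector is not stated in the abstract.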
Data Availability
The raw/processed data required to reproduce these findings cannot be shared at this time, as the data also form part of an ongoing study.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grants U1808206, 61972097, and U21A20472, in part by the National Key Research and Development Plan of China under Grant 2021YFB3600503, in part by the Natural Science Foundation of Fujian Province under Grants 2021J01612 and 2020J01494, and in part by the Major Science and Technology Project of Fujian Province under Grant 2021HZ022007.
About this article
Cite this article
Li, J., Tan, G., Ke, X. et al. Object detection based on knowledge graph network. Appl Intell 53, 15045–15066 (2023). https://doi.org/10.1007/s10489-022-04116-9
DOI: https://doi.org/10.1007/s10489-022-04116-9