CoBjeason: Reasoning Covered Object in Image by Multi-Agent Collaboration Based on Informed Knowledge Graph

Published: 28 February 2024

Abstract

Object detection has been widely studied. In this paper, we turn to a more challenging problem, “Covered Object Reasoning”, which aims to infer the category label of a target object in a given image when that object is totally covered (or invisible). To address this problem, we propose CoBjeason, which brings visual reasoning together with the knowledge graph: “empirical cognition” of common visual contexts is incorporated as a knowledge graph, over which two collaborative agents conduct reinforced multi-hop reasoning. On the one hand, the agents stand at the covered object (the unknown entity), observe the surrounding visual cues in the given image, and gradually select entities and relations from the global gallery-level knowledge graph, which contains entity pairs that occur frequently across the entire image collection, so as to infer the main structure of the image-level knowledge graph expanded forward from the unknown entity. On the other hand, based on the inferred image-level knowledge graph, the semantic context among entities is aggregated backward into the unknown entity to select an appropriate entity from the global gallery-level knowledge graph as the reasoning result. Moreover, the two agents collaborate with each other, ensuring that this Forward and Backward Reasoning moves toward the same goal of higher performance on covered object reasoning. To the best of our knowledge, this is the first work on Covered Object Reasoning with Knowledge Graphs and reinforced Multi-Agent collaboration.
In particular, our study of Covered Object Reasoning and the proposed CoBjeason model could offer novel insights into more basic Computer Vision (CV) tasks: Semantic Segmentation, through a better understanding of the current scene when some objects are blurred or covered; Visual Question Answering, through enhanced inference in more complicated visual contexts when some objects are covered or invisible; and Image Caption Generation, through richer visual context for images containing partially visible objects. Improvements on these basic CV tasks can in turn benefit more complicated tasks that involve nuanced visual interpretation, such as Autonomous Driving, where recognizing and reasoning about partially visible or covered objects is critical. According to the experimental results, CoBjeason achieves the best overall ranking performance on covered object reasoning among the compared models, while also offering a lower “exploration cost”, insensitivity to long-tail covered objects, and acceptable time complexity.
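
The idea of a gallery-level knowledge graph built from entity pairs that occur frequently across an image collection can be illustrated with a small sketch. This is not the CoBjeason model: the reinforced two-agent, multi-hop reasoning is replaced here by a plain co-occurrence count, and the function names (`build_gallery_graph`, `rank_covered_object`) and the toy label sets are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def build_gallery_graph(image_labels):
    """Count how often each unordered pair of labels appears in the same image.

    Stand-in for a gallery-level graph; the labels below are made-up toy data.
    """
    pair_counts = Counter()
    for labels in image_labels:
        for a, b in combinations(sorted(set(labels)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def cooccurrence(pair_counts, a, b):
    """Look up the count for an unordered pair (0 if the pair was never seen)."""
    return pair_counts[tuple(sorted((a, b)))]


def rank_covered_object(pair_counts, visible, top_k=3):
    """Rank candidate labels for the covered object.

    Simplification (assumption): each candidate is scored by its summed
    co-occurrence with the visible objects, instead of learned policies.
    """
    candidates = {label for pair in pair_counts for label in pair} - set(visible)
    scores = {
        c: sum(cooccurrence(pair_counts, c, v) for v in visible)
        for c in candidates
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


gallery = [
    ["table", "chair", "cup"],
    ["table", "cup", "plate"],
    ["road", "car", "traffic light"],
    ["table", "chair", "plate", "cup"],
]
graph = build_gallery_graph(gallery)
result = rank_covered_object(graph, visible=["table", "chair"])
print(result)
```

With these toy images, an object hidden next to a visible table and chair is ranked as most likely a cup, because "cup" co-occurs with both more often than any other label. The approach described in the abstract differs in that the context graph is expanded and aggregated by collaborating agents rather than scored in a single pass.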

    Published In

    ACM Transactions on Knowledge Discovery from Data  Volume 18, Issue 5
    June 2024
    699 pages
    EISSN:1556-472X
    DOI:10.1145/3613659

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 28 February 2024
    Online AM: 26 January 2024
    Accepted: 22 January 2024
    Revised: 07 December 2023
    Received: 17 July 2023
    Published in TKDD Volume 18, Issue 5

    Author Tags

    1. Covered object reasoning
    2. visual reasoning
    3. multi-hop knowledge graph reasoning
    4. multi-agent reinforcement learning

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • Natural Science Foundation of Jiangsu Province (Basic Research Program)
