Complete interest propagation from part for visual relation of interest detection

Zhou, You; Yu, Fan

doi:10.1007/s13042-022-01603-w

Original Article
Published: 02 August 2022

Volume 14, pages 455–465 (2023)

International Journal of Machine Learning and Cybernetics

Abstract

Visual relation detection (VRD) describes an image with visual relation triplets of the form <subject, predicate, object>. As an extension of the traditional VRD task, visual relation of interest detection (VROID) aims to obtain visual relations of interest, i.e., visual relations that are semantically important for expressing the main content of an image. In this paper, we propose a complete interest propagation from part (CIPFP) method for VROID, which exploits semantic parts and propagates interest along the part-instance-relation hierarchy. Specifically, interest in CIPFP is propagated from parts to part pairs, from parts to instances, from part pairs to instance pairs, from instances to instance pairs, from parts to relation triplets, and from instance pairs to relation triplets. We conduct extensive experiments to validate the effectiveness of the CIPFP method and of its components.
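The six propagation paths listed in the abstract form a directed acyclic graph over five levels. The following is a minimal sketch of that structure only, not the authors' implementation: the level names (`part`, `part_pair`, `instance`, `instance_pair`, `relation_triplet`) and the helper `propagation_order` are hypothetical labels introduced here for illustration, and no scoring or learning step of CIPFP is modelled.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# The six propagation paths named in the abstract, as (source, target) pairs.
# The level names are illustrative labels, not identifiers from the paper.
PROPAGATION_PATHS = [
    ("part", "part_pair"),
    ("part", "instance"),
    ("part_pair", "instance_pair"),
    ("instance", "instance_pair"),
    ("part", "relation_triplet"),
    ("instance_pair", "relation_triplet"),
]


def propagation_order(paths):
    """Return one order in which the levels can be processed so that every
    source level comes before each level it propagates interest to."""
    predecessors = {}
    for source, target in paths:
        # TopologicalSorter expects a mapping: node -> set of predecessors.
        predecessors.setdefault(target, set()).add(source)
        predecessors.setdefault(source, set())
    return list(TopologicalSorter(predecessors).static_order())


order = propagation_order(PROPAGATION_PATHS)
print(order)
```

Under these assumptions, any valid order starts at `part` and ends at `relation_triplet`, which matches the abstract's description of interest flowing from parts up to relation triplets.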

Acknowledgements

This work is supported by the National Science Foundation of China (62072232), the Natural Science Foundation of Jiangsu Province (BK20191248), and the Collaborative Innovation Center of Novel Software Technology and Industrialization.

Author information

Authors and Affiliations

Jiangsu Vocational Institute of Commerce, Nanjing, China
You Zhou
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Fan Yu

Corresponding author

Correspondence to Fan Yu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhou, Y., Yu, F. Complete interest propagation from part for visual relation of interest detection. Int. J. Mach. Learn. & Cyber. 14, 455–465 (2023). https://doi.org/10.1007/s13042-022-01603-w

Received: 01 July 2021
Accepted: 17 June 2022
Published: 02 August 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s13042-022-01603-w
