
An Effective Dynamic Reweighting Method for Unbiased Scene Graph Generation

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14425)


Abstract

Despite the remarkable advancements in Scene Graph Generation (SGG) in recent years, precisely capturing and modeling long-tail object relationships remains a persistent challenge in the field. Conventional methods generally employ resampling and reweighting techniques to achieve unbiased predictions. Existing reweighting methods in SGG calculate weights from the class distribution of the dataset, and they reweight only the related samples while overlooking the samples whose objects are unrelated. However, the sample distribution during training is inconsistent with the class distribution of the dataset, and the reweighting of samples whose objects are unrelated should not be overlooked. In this paper, we propose a novel method named Dynamic Reweighting based on the Sample Distribution (DRSD). DRSD calculates class weights from the sample distribution observed during training and also incorporates reweighting for the samples whose objects are unrelated. Specifically, we utilize a sample-queue mechanism to record and update the sample distribution and introduce a transition mechanism to ensure training stability. Experiments conducted on the Visual Genome dataset demonstrate the effectiveness of our method. Our method is model-agnostic and yields significant performance improvements on three benchmark models (Motif, VCTree, and Transformer). Specifically, it achieves increases of 23.4%, 25.1%, and 27.6% on the mR@100 metric for the Predicate Classification task, reaching 40.9%, 41.2%, and 43.4%, respectively. Moreover, our method outperforms the state-of-the-art reweighting method in SGG, i.e., FGPL, by 3%.
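The core idea described in the abstract — deriving class weights from the sample distribution observed during training (via a sample queue) rather than from the static dataset distribution, with a transition mechanism for stability — can be sketched as follows. This is a minimal illustrative sketch, not the authors' exact DRSD formulation: the class `DynamicReweighter`, the inverse-frequency weighting, and the linear transition ramp are all assumptions made for illustration.

```python
from collections import Counter, deque


class DynamicReweighter:
    """Illustrative sketch: class weights from the *observed* sample
    distribution, kept in a bounded queue, rather than from the static
    dataset class distribution."""

    def __init__(self, num_classes, queue_size=1000, transition_steps=500):
        self.num_classes = num_classes
        # Sample queue: bounded record of recently seen class labels.
        self.queue = deque(maxlen=queue_size)
        self.transition_steps = transition_steps
        self.step = 0

    def update(self, labels):
        """Record the class labels seen in the current training batch."""
        self.queue.extend(labels)
        self.step += 1

    def weights(self):
        """Inverse-frequency weights over the queued sample distribution,
        blended toward uniform weights early in training (a simple
        stand-in for a transition mechanism)."""
        counts = Counter(self.queue)
        total = max(len(self.queue), 1)
        # Rare classes in the queue receive larger weights.
        inv = [total / (self.num_classes * counts.get(c, 1))
               for c in range(self.num_classes)]
        # Transition: ramp linearly from uniform (1.0) to dynamic weights.
        alpha = min(self.step / self.transition_steps, 1.0)
        return [(1 - alpha) + alpha * w for w in inv]
```

In use, the reweighter would be updated with each batch's predicate labels (including a "no relation" background class, so that samples whose objects are unrelated are also reweighted), and the resulting weights fed into a weighted classification loss.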


Notes

  1. https://homes.cs.washington.edu/~ranjay/visualgenome/index.html.


Acknowledgments

This work was supported by the National Key R&D Program of China under Grant 2022ZD0115502, by the National Natural Science Foundation of China under Grants U21A20514 and 62122010, and by the FuXiaQuan National Independent Innovation Demonstration Zone Collaborative Innovation Platform Project under Grant 3502ZCQXT2022008.

Author information


Correspondence to Hanzi Wang.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Hu, L., Liu, S., Wang, H. (2024). An Effective Dynamic Reweighting Method for Unbiased Scene Graph Generation. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14425. Springer, Singapore. https://doi.org/10.1007/978-981-99-8429-9_28


  • DOI: https://doi.org/10.1007/978-981-99-8429-9_28

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8428-2

  • Online ISBN: 978-981-99-8429-9

  • eBook Packages: Computer Science, Computer Science (R0)
