Debiasing vision-language models for vision tasks: a survey

  • Letter
Frontiers of Computer Science


Author information

Corresponding author

Correspondence to Beier Zhu.

Ethics declarations

Competing interests The authors declare that they have no competing interests or financial conflicts to disclose.

About this article

Cite this article

Zhu, B., Zhang, H. Debiasing vision-language models for vision tasks: a survey. Front. Comput. Sci. 19, 191321 (2025). https://doi.org/10.1007/s11704-024-40051-3

  • DOI: https://doi.org/10.1007/s11704-024-40051-3