Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization

  • Conference paper
  • Conference: Computer Vision – ECCV 2022 (ECCV 2022)
  • Book series: Lecture Notes in Computer Science (LNCS, volume 13685)

Abstract

Out-Of-Distribution (OOD) generalization is all about learning invariance against environmental changes. If the context (in this paper, “context” denotes any class-agnostic attribute such as color, texture, or background; the formal definition is in Appendix A.2) in every class were evenly distributed, OOD would be trivial, because the context could be easily removed thanks to an underlying principle: class is invariant to context. However, collecting such a balanced dataset is impractical. Learning on imbalanced data biases the model toward context and thus hurts OOD. Therefore, the key to OOD is context balance. We argue that the assumption widely adopted in prior work, that context bias can be directly annotated or estimated from biased class prediction, renders the estimated context incomplete or even incorrect. In contrast, we point out the long-overlooked other side of the above principle: context is also invariant to class, which motivates us to consider the classes (which are already labeled) as the varying environments (“environments” [2] denotes subsets of the training data built by some criterion; in this paper, we take each class as an environment, which is our key idea) to resolve context bias without context labels. We implement this idea by minimizing the contrastive loss of intra-class sample similarity while requiring this similarity to be invariant across all classes. On benchmarks with various context biases and domain gaps, we show that a simple re-weighting-based classifier equipped with our context estimation achieves state-of-the-art performance. Theoretical justifications are provided in the Appendix, and code is available on GitHub: https://github.com/simpleshinobu/IRMCon.
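
To make the key idea concrete, here is a minimal, hypothetical PyTorch sketch, not the authors' implementation (see the official repository above for that): each class is treated as an IRM “environment” [2], an InfoNCE-style contrastive loss is computed over intra-class sample similarity, and an IRMv1-style gradient penalty encourages that loss to be stationary across classes. All names here (intra_class_contrastive_loss, irmcon_loss, penalty_weight) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def intra_class_contrastive_loss(feats, scale, temperature=0.1):
    """InfoNCE-style loss over the samples of ONE class: each sample is
    paired with another sample of the same class as its positive, and the
    remaining samples of that class act as negatives."""
    z = F.normalize(feats, dim=1)
    logits = scale * (z @ z.t()) / temperature             # pairwise cosine similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    logits = logits.masked_fill(self_mask, float("-inf"))  # exclude self-pairs
    targets = (torch.arange(n, device=z.device) + 1) % n   # positive of i is (i+1) mod n
    return F.cross_entropy(logits, targets)


def irmcon_loss(feats, labels, penalty_weight=1.0):
    """Per-class contrastive risks plus an IRMv1-style gradient penalty that
    encourages the risk to be invariant across all classes (= environments)."""
    # IRMv1 trick from [2]: a dummy scalar "classifier" fixed at 1.0; the squared
    # gradient of each environment's risk w.r.t. it measures non-invariance.
    scale = torch.tensor(1.0, device=feats.device, requires_grad=True)
    total_risk, total_penalty, n_envs = 0.0, 0.0, 0
    for c in labels.unique():
        class_feats = feats[labels == c]
        if class_feats.size(0) < 2:                        # need at least one positive pair
            continue
        risk = intra_class_contrastive_loss(class_feats, scale)
        (grad,) = torch.autograd.grad(risk, scale, create_graph=True)
        total_risk += risk
        total_penalty += grad.pow(2)
        n_envs += 1
    return (total_risk + penalty_weight * total_penalty) / max(n_envs, 1)


# Usage sketch: feats would come from the feature extractor being trained.
feats = torch.randn(32, 128, requires_grad=True)           # toy batch of features
labels = torch.randint(0, 4, (32,))                        # toy class labels
loss = irmcon_loss(feats, labels)
loss.backward()                                            # gradients flow back to feats
```

Features learned under such an objective would tend to encode context rather than class: any class-discriminative signal makes the per-class contrastive risk vary across classes and is therefore penalized. The estimated context could then drive the re-weighting classifier mentioned above.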


Notes

  1. It is also known as a causal or stable feature in the literature [49, 65, 70].

References

  1. Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., Marchand, M.: Domain-adversarial neural networks. In: NIPS (2014)

  2. Arjovsky, M., Bottou, L., Gulrajani, I., Lopez-Paz, D.: Invariant risk minimization. arXiv preprint arXiv:1907.02893 (2019)

  3. Austin, P.C.: An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research (2011)

  4. Bahng, H., Chun, S., Yun, S., Choo, J., Oh, S.J.: Learning de-biased representations with biased representations. In: ICML (2020)

  5. Ben-David, S., Blitzer, J., Crammer, K., Pereira, F., et al.: Analysis of representations for domain adaptation. In: NIPS (2007)

  6. Carlucci, F.M., D’Innocente, A., Bucci, S., Caputo, B., Tommasi, T.: Domain generalization by solving Jigsaw puzzles. In: CVPR (2019)

  7. Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: International Conference on Machine Learning, pp. 1597–1607. PMLR (2020)

  8. Clark, C., Yatskar, M., Zettlemoyer, L.: Don’t take the easy way out: ensemble based methods for avoiding known dataset biases. arXiv preprint arXiv:1909.03683 (2019)

  9. Cubuk, E.D., Zoph, B., Shlens, J., Le, Q.V.: RandAugment: practical automated data augmentation with a reduced search space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 702–703 (2020)

  10. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)

  11. Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F.A., Brendel, W.: ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. arXiv preprint arXiv:1811.12231 (2018)

  12. Gong, M., Zhang, K., Liu, T., Tao, D., Glymour, C., Schölkopf, B.: Domain adaptation with conditional transferable components. In: ICML (2016)

  13. Grill, J.B., et al.: Bootstrap your own latent - a new approach to self-supervised learning. Adv. Neural. Inf. Process. Syst. 33, 21271–21284 (2020)

  14. Gulrajani, I., Lopez-Paz, D.: In search of lost domain generalization. In: ICLR (2021)

  15. He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. In: CVPR (2020)

  16. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)

  17. He, Y., Shen, Z., Cui, P.: Towards non-IID image classification: a dataset and baselines. Pattern Recogn. 110, 107383 (2021)

  18. Hendrycks, D., Dietterich, T.: Benchmarking neural network robustness to common corruptions and perturbations. In: ICLR (2019)

  19. Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-of-distribution examples in neural networks. In: ICLR (2017)

  20. Higgins, I., et al.: Towards a definition of disentangled representations. arXiv preprint arXiv:1812.02230 (2018)

  21. Huang, Z., Wang, H., Xing, E.P., Huang, D.: Self-challenging improves cross-domain generalization. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12347, pp. 124–140. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58536-5_8

  22. Jung, Y., Tian, J., Bareinboim, E.: Learning causal effects via weighted empirical risk minimization. In: NIPS (2020)

  23. Khan, S.H., Hayat, M., Bennamoun, M., Sohel, F.A., Togneri, R.: Cost-sensitive learning of deep feature representations from imbalanced data. IEEE Trans. Neural Netw. Learn. Syst. (2017)

  24. Kim, B., Kim, H., Kim, K., Kim, S., Kim, J.: Learning not to learn: training deep neural networks with biased data. In: CVPR, pp. 9012–9020 (2019)

  25. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  26. Krueger, D., et al.: Out-of-distribution generalization via risk extrapolation (REx). In: International Conference on Machine Learning (2021)

  27. Lee, J., Kim, E., Lee, J., Lee, J., Choo, J.: Learning debiased representation via disentangled feature augmentation. In: NIPS (2021)

  28. Li, D., Yang, Y., Song, Y.Z., Hospedales, T.M.: Deeper, broader and artier domain generalization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5542–5550 (2017)

  29. Li, D., Yang, Y., Song, Y.Z., Hospedales, T.M.: Learning to generalize: meta-learning for domain generalization. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)

  30. Li, H., Pan, S.J., Wang, S., Kot, A.C.: Domain generalization with adversarial feature learning. In: CVPR (2018)

  31. Li, Y., et al.: Deep domain generalization via conditional invariant adversarial networks. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11219, pp. 647–663. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01267-0_38

  32. Li, Y., Vasconcelos, N.: REPAIR: removing representation bias by dataset resampling. In: CVPR, pp. 9572–9581 (2019)

  33. Liang, S., Li, Y., Srikant, R.: Enhancing the reliability of out-of-distribution image detection in neural networks. In: ICLR (2018)

  34. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48

  35. Little, R.J., Rubin, D.B.: Statistical Analysis with Missing Data, vol. 793. Wiley, Hoboken (2019)

  36. Liu, J., Hu, Z., Cui, P., Li, B., Shen, Z.: Heterogeneous risk minimization. In: ICML (2021)

  37. Liu, Z., Miao, Z., Zhan, X., Wang, J., Gong, B., Yu, S.X.: Large-scale long-tailed recognition in an open world. In: CVPR (2019)

  38. Van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(11) (2008)

  39. Mahajan, D., Tople, S., Sharma, A.: Domain generalization using causal matching. In: International Conference on Machine Learning, pp. 7313–7324. PMLR (2021)

  40. Muandet, K., Balduzzi, D., Schölkopf, B.: Domain generalization via invariant feature representation. In: ICML (2013)

  41. Nam, J., Cha, H., Ahn, S., Lee, J., Shin, J.: Learning from failure: training debiased classifier from biased classifier. In: NIPS (2020)

  42. Okumura, R., Okada, M., Taniguchi, T.: Domain-adversarial and -conditional state space model for imitation learning. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE (2020)

  43. Oord, A.v.d., Li, Y., Vinyals, O.: Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748 (2018)

  44. Peters, J., Bühlmann, P., Meinshausen, N.: Causal inference by using invariant prediction: identification and confidence intervals. J. Roy. Stat. Soc. Ser. B (Stat. Methodol.) 947–1012 (2016)

  45. Pezeshki, M., Kaba, S.O., Bengio, Y., Courville, A., Precup, D., Lajoie, G.: Gradient starvation: a learning proclivity in neural networks. In: NIPS (2021)

  46. Pfister, N., Bühlmann, P., Peters, J.: Invariant causal prediction for sequential data. J. Am. Stat. Assoc. 114(527), 1264–1276 (2019)

  47. Recht, B., Roelofs, R., Schmidt, L., Shankar, V.: Do ImageNet classifiers generalize to ImageNet? In: ICML (2019)

  48. Sagawa, S., Koh, P.W., Hashimoto, T.B., Liang, P.: Distributionally robust neural networks for group shifts: on the importance of regularization for worst-case generalization. In: ICLR (2020)

  49. Schölkopf, B., et al.: Toward causal representation learning. Proc. IEEE 109(5), 612–634 (2021)

  50. Seaman, S.R., Vansteelandt, S.: Introduction to double robust methods for incomplete data. Stat. Sci. Rev. J. Inst. Math. Stat. (2018)

  51. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: ICCV (2017)

  52. Shen, Z., et al.: Towards out-of-distribution generalization: a survey. arXiv preprint arXiv:2108.13624 (2021)

  53. Shi, Y., et al.: Gradient matching for domain generalization. arXiv preprint arXiv:2104.09937 (2021)

  54. Sun, B., Saenko, K.: Deep CORAL: correlation alignment for deep domain adaptation. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 443–450. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_35

  55. Suter, R., Miladinovic, D., Schölkopf, B., Bauer, S.: Robustly disentangled causal mechanisms: validating deep representations for interventional robustness. In: ICML (2019)

  56. Tang, K., Huang, J., Zhang, H.: Long-tailed classification by keeping the good and removing the bad momentum causal effect. In: NIPS (2020)

  57. Tartaglione, E., Barbano, C.A., Grangetto, M.: EnD: entangling and disentangling deep representations for bias correction. In: CVPR (2021)

  58. Tzeng, E., Hoffman, J., Saenko, K., Darrell, T.: Adversarial discriminative domain adaptation. In: CVPR (2017)

  59. Vapnik, V.: Principles of risk minimization for learning theory. In: Advances in Neural Information Processing Systems (1992)

  60. Volpi, R., Murino, V.: Addressing model vulnerability to distributional shifts over image transformation sets. In: ICCV (2019)

  61. Volpi, R., Namkoong, H., Sener, O., Duchi, J., Murino, V., Savarese, S.: Generalizing to unseen domains via adversarial data augmentation. In: NIPS (2018)

  62. Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD birds-200-2011 dataset. California Institute of Technology (2011)

  63. Wang, H., He, Z., Lipton, Z.C., Xing, E.P.: Learning robust representations by projecting superficial statistics out. arXiv preprint arXiv:1903.06256 (2019)

  64. Wang, T., Yue, Z., Huang, J., Sun, Q., Zhang, H.: Self-supervised learning disentangled group representation as feature. In: NIPS (2021)

  65. Wang, T., Zhou, C., Sun, Q., Zhang, H.: Causal attention for unbiased visual recognition. In: ICCV (2021)

  66. Xian, Y., Lampert, C.H., Schiele, B., Akata, Z.: Zero-shot learning-a comprehensive evaluation of the good, the bad and the ugly. IEEE Trans. Pattern Anal. Mach. Intell. (2018)

  67. Xu, Y., Jaakkola, T.: Learning representations that support robust transfer of predictors. arXiv preprint arXiv:2110.09940 (2021)

  68. Yan, S., Song, H., Li, N., Zou, L., Ren, L.: Improve unsupervised domain adaptation with mixup training. arXiv preprint arXiv:2001.00677 (2020)

  69. Yue, Z., Sun, Q., Hua, X.S., Zhang, H.: Transporting causal mechanisms for unsupervised domain adaptation. In: ICCV (2021)

  70. Zhang, X., Cui, P., Xu, R., Zhou, L., He, Y., Shen, Z.: Deep stable learning for out-of-distribution generalization. In: CVPR (2021)

  71. Zhang, Z., Sabuncu, M.R.: Generalized cross entropy loss for training deep neural networks with noisy labels. In: NIPS (2018)


Acknowledgements

This research was supported by the Alibaba-NTU Singapore Joint Research Institute (JRI), Artificial Intelligence Singapore (AISG), the Alibaba Innovative Research (AIR) programme, and A*STAR under its AME YIRG Grant (Project No. A20E6c0101).

Author information

Corresponding author

Correspondence to Jiaxin Qi.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 813 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Qi, J., Tang, K., Sun, Q., Hua, X.S., Zhang, H. (2022). Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13685. Springer, Cham. https://doi.org/10.1007/978-3-031-19806-9_6

  • DOI: https://doi.org/10.1007/978-3-031-19806-9_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19805-2

  • Online ISBN: 978-3-031-19806-9

  • eBook Packages: Computer Science, Computer Science (R0)
