Abstract
Detecting out-of-distribution (OOD) inputs is pivotal for real-world applications. However, because OOD samples are inaccessible during the training phase, supervised binary classification with in-distribution (ID) and OOD labels is not feasible. Previous works therefore typically employ a proxy ID classification task to learn feature representations for OOD detection. In this study, we examine the relationship between the two tasks through the lens of information theory. Our analysis reveals that optimizing the classification objective inevitably causes over-confidence and an undesired compression of information relevant to OOD detection. To address these two problems, we propose OOD Entropy Regularization (OER), which regularizes the information captured by classification-oriented representation learning for detecting OOD samples. Both theoretical analyses and experimental results underscore the consistent improvement that OER brings to OOD detection.
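The excerpt above does not reproduce the exact OER formulation, but the core idea of adding an entropy-based regularizer on top of the standard classification objective can be sketched as follows. This is a minimal, hypothetical illustration: the loss form, the `lambda_oer` weight, and the use of the predictive-distribution entropy are assumptions for exposition, not the authors' exact method.

```python
import torch
import torch.nn.functional as F

def entropy_regularized_loss(logits: torch.Tensor,
                             labels: torch.Tensor,
                             lambda_oer: float = 0.1) -> torch.Tensor:
    """Cross-entropy classification loss plus an entropy-based regularizer.

    Hypothetical sketch: the proxy ID classification objective is kept,
    while the entropy of the predictive distribution is encouraged so the
    representation is not over-compressed into over-confident predictions.
    """
    # Standard ID classification objective (the proxy task).
    ce_loss = F.cross_entropy(logits, labels)

    # Entropy of the softmax predictive distribution, averaged over the batch.
    probs = F.softmax(logits, dim=-1)
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(probs * log_probs).sum(dim=-1).mean()

    # Subtracting the entropy term counteracts over-confidence;
    # lambda_oer trades off classification accuracy against regularization.
    return ce_loss - lambda_oer * entropy
```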
Acknowledgments
This work is partially supported by the National Key R&D Program of China (No. 2021ZD0111901) and the National Natural Science Foundation of China (NSFC) under Grants 62376259 and 62276246.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hu, J., Liu, W., Chang, H., Ma, B., Shan, S., Chen, X. (2025). An Information Theoretical View for Out-of-Distribution Detection. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15113. Springer, Cham. https://doi.org/10.1007/978-3-031-73001-6_24
DOI: https://doi.org/10.1007/978-3-031-73001-6_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73000-9
Online ISBN: 978-3-031-73001-6
eBook Packages: Computer Science, Computer Science (R0)