Abstract
Out-of-distribution (OOD) detection is critical for the stable and reliable operation of systems built on deep neural networks (DNNs). Although many OOD detection methods have been proposed, it remains unclear how the differences between in-distribution (ID) and OOD samples arise at each processing step inside a DNN. We clarify this issue experimentally by investigating the layer dependence of feature representations from multiple perspectives. We find that the intrinsic low dimensionalization of DNNs is essential for understanding how OOD samples become increasingly distinct from ID samples as features propagate to deeper layers. Based on these observations, we provide a simple picture that consistently explains various properties of OOD samples. Specifically, low-dimensional weights eliminate most of the information carried by OOD samples, resulting in misclassifications due to excessive attention to dataset bias. In addition, we demonstrate the utility of dimensionality by proposing a dimensionality-aware OOD detection method based on the alignment of features and weights, which consistently achieves high performance on various datasets at lower computational cost. Our implementation is publicly available at https://github.com/kuematsu3/Dimensionality-aware-Projection-based-OOD-Detection.
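The exact scoring rule is in the linked repository; the minimal NumPy sketch below only illustrates the general idea of a projection-based, feature-weight-alignment score. The subspace choice (top right-singular vectors of the classifier weights), the norm-ratio score, and all function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_weight_subspace(W, k):
    """Orthonormal basis of the top-k right-singular subspace of the
    classifier weight matrix W (num_classes x feature_dim). If the trained
    weights are intrinsically low-dimensional, this subspace carries the
    information the classifier actually uses."""
    _, _, Vt = np.linalg.svd(W, full_matrices=False)
    return Vt[:k]                          # shape (k, feature_dim)

def alignment_score(z, basis, eps=1e-12):
    """Fraction of the feature norm lying inside the weight subspace.
    ID features should be well aligned; OOD features leak energy into
    the discarded directions, giving a lower score."""
    proj = basis @ z                       # coordinates in the subspace
    return np.linalg.norm(proj) / (np.linalg.norm(z) + eps)

# Toy usage: an in-subspace feature scores ~1, a generic feature much lower.
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 512))             # stand-in for trained weights
basis = fit_weight_subspace(W, k=10)
z_id = basis.T @ rng.normal(size=10)       # lies exactly in the subspace
z_ood = rng.normal(size=512)               # energy spread over all 512 dims
print(alignment_score(z_id, basis))        # ~ 1.0
print(alignment_score(z_ood, basis))       # ~ sqrt(10/512) ~ 0.14
```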
Notes
- 1.
The transition layer is typically located just after the deepest pooling layer other than the global average pooling. The exception is ResNet-18, where the transition layer is slightly deeper; this may be due to insufficient low dimensionalization around the corresponding pooling layer.
- 2.
The layer-ensemble method of Ref. [5] can improve the detection accuracy for far-from-ID OOD samples, but it is not suitable for close-to-ID OOD detection: we confirmed that the AUROC for detecting the CIFAR-100 OOD dataset with the ensemble method adopted in Ref. [5] is only around 0.86 for models trained on CIFAR-10. The layer ensemble also requires substantial memory to store per-layer covariance matrices (see the sketch below), which makes it ill-suited to resource-limited hardware. More seriously, the ensemble method of Ref. [5] requires access to some OOD samples, which is practically unavailable when we do not know what kinds of OOD samples will contaminate the input.
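For context, the per-layer detector being ensembled in Ref. [5] models intermediate features as class-conditional Gaussians with a shared covariance and scores an input by its distance to the nearest class mean. A minimal single-layer sketch along those lines (variable names and the use of a pseudo-inverse are our choices, not the reference implementation) makes the memory point concrete: the layer ensemble must store one feature-dim x feature-dim covariance per layer.

```python
import numpy as np

def fit_gaussians(features, labels):
    """Class means and the inverse of a single shared covariance, as in the
    class-conditional Gaussian detector of Ref. [5]. The D x D covariance is
    what the layer ensemble must store once per layer."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    return means, np.linalg.pinv(cov)      # pinv guards against singular cov

def mahalanobis_score(z, means, prec):
    """Negative squared Mahalanobis distance to the closest class mean;
    lower (more negative) values indicate more OOD-like features."""
    diffs = means - z                      # broadcast over classes
    return -np.min(np.einsum('cd,de,ce->c', diffs, prec, diffs))

# Toy usage on 2-D features for two classes.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
labels = np.array([0] * 100 + [1] * 100)
means, prec = fit_gaussians(feats, labels)
print(mahalanobis_score(np.array([0.1, -0.2]), means, prec))    # near a mean
print(mahalanobis_score(np.array([20.0, -20.0]), means, prec))  # far from both
```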
References
Yang, J., Zhou, K., Li, Y., Liu, Z.: Generalized Out-of-Distribution Detection: A Survey. arXiv:2110.11334
Salehi, M., Mirzaei, H., Hendrycks, D., Li, Y., Rohban, M.H., Sabokrou, M.: A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and Future Challenges. arXiv:2110.14051
Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-of-distribution examples in neural networks. In: 5th International Conference on Learning Representations (2017)
Dietterich, T.G., Guyer, A.: The familiarity hypothesis: explaining the behavior of deep open set methods. Pattern Recogn. 132, 108931 (2022)
Lee, K., Lee, K., Lee, H., Shin, J.: A simple unified framework for detecting out-of-distribution samples and adversarial attacks. In: Advances in Neural Information Processing Systems (2018)
Kamoi, R., Kobayashi, K.: Why is the Mahalanobis Distance Effective for Anomaly Detection? arXiv:2003.00402
Ndiour, I., Ahuja, N., Tickoo, O.: Out-of-Distribution Detection With Subspace Techniques and Probabilistic Modeling of Features. arXiv:2012.04250
Wang, H., Li, Z., Feng, L., Zhang, W.: ViM: out-of-distribution with virtual-logit matching. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2022)
Song, Y., Sebe, N., Wang, W.: RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection. arXiv:2209.08590
Lin, Z., Roy, S.D., Li, Y.: MOOD: multi-level out-of-distribution detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021)
Hendrycks, D., Mazeika, M., Dietterich, T.: Deep anomaly detection with outlier exposure. In: International Conference on Learning Representations (2019)
Hendrycks, D., Mazeika, M., Kadavath, S., Song, D.: Using self-supervised learning can improve model robustness and uncertainty. In: Advances in Neural Information Processing Systems (2019)
Tack, J., Mo, S., Jeong, J., Shin, J.: CSI: novelty detection via contrastive learning on distributionally shifted instances. In: Advances in Neural Information Processing Systems (2020)
Yu, S., Lee, D., Yu, H.: Convolutional neural networks with compression complexity pooling for out-of-distribution image detection. In: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (2020)
Sastry, C.S., Oore, S.: Detecting Out-of-Distribution Examples with In-distribution Examples and Gram Matrices. arXiv:1912.12510
Ren, J., Fort, S., Liu, J., Roy, A.G., Padhy, S., Lakshminarayanan, B.: A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection. arXiv:2106.09022
Rippel, O., Mertens, P., Merhof, D.: Modeling the distribution of normal data in pre-trained deep features for anomaly detection. arXiv:2005.14140
Defard, T., Setkov, A., Loesch, A., Audigier, R.: PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization. arXiv:2011.08785
Cook, M., Zare, A., Gader, P.: Outlier Detection through Null Space Analysis of Neural Networks. arXiv:2007.01263
Arora, S., Ge, R., Neyshabur, B., Zhang, Y.: Stronger generalization bounds for deep nets via a compression approach. In: Proceedings of the 35th International Conference on Machine Learning (2018)
Kornblith, S., Norouzi, M., Lee, H., Hinton, G.: Similarity of neural network representations revisited. In: Proceedings of the 36th International Conference on Machine Learning (2019)
Nguyen, T., Raghu, M., Kornblith, S.: Do wide and deep networks learn the same things? Uncovering how neural network representations vary with width and depth. In: International Conference on Learning Representations (2021)
Raghu, M., Unterthiner, T., Kornblith, S., Zhang, C., Dosovitskiy, A.: Do vision transformers see like convolutional neural networks? In: Advances in Neural Information Processing Systems (2021)
Kornblith, S., Chen, T., Lee, H., Norouzi, M.: Why do better loss functions lead to less transferable features? In: Advances in Neural Information Processing Systems (2021)
Nguyen, T., Raghu, M., Kornblith, S.: On the Origins of the Block Structure Phenomenon in Neural Network Representations. arXiv:2202.07184
Cristianini, N., Shawe-Taylor, J., Elisseeff, A., Kandola, J.: On kernel-target alignment. In: Advances in Neural Information Processing Systems (2001)
Cortes, C., Mohri, M., Rostamizadeh, A.: Algorithms for learning kernels based on centered alignment. J. Mach. Learn. Res. 13(1), 795–828 (2012)
Fefferman, C., Mitter, S., Narayanan, H.: Testing the manifold hypothesis. J. Amer. Math. Soc. 29(4), 983 (2016)
Kothapalli, V.: Neural collapse: a review on modelling principles and generalization. Trans. Mach. Learn. Res. (2023)
Liang, S., Li, Y., Srikant, R.: Enhancing the reliability of out-of-distribution image detection in neural networks. In: International Conference on Learning Representations (2018)
Liu, W., Wang, X., Owens, J., Li, Y.: Energy-based out-of-distribution detection. In: Advances in Neural Information Processing Systems (2020)
Vaze, S., Han, K., Vedaldi, A., Zisserman, A.: Open-set recognition: a good closed-set classifier is all you need. In: International Conference on Learning Representations (2022)
Sun, Y., Guo, C., Li, Y.: ReAct: out-of-distribution detection with rectified activations. In: Advances in Neural Information Processing Systems (2021)
Sun, Y., Li, Y.: DICE: leveraging sparsification for out-of-distribution detection. In: European Conference on Computer Vision (2022)
Huang, R., Li, Y.: MOS: towards scaling out-of-distribution detection for large semantic space. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021)
Suzuki, T., Abe, H., Nishimura, T.: Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network. In: International Conference on Learning Representations (2020)
Sanyal, A., Torr, P.H., Dokania, P.K.: Stable rank normalization for improved generalization in neural networks and GANs. In: International Conference on Learning Representations (2020)
Acknowledgments
TS was partially supported by JSPS KAKENHI (24K02905) and JST CREST (JPMJCR2015).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Uematsu, K., Haruki, K., Suzuki, T., Kimura, M., Takimoto, T., Nakagawa, H. (2024). Dimensionality-Induced Information Loss of Outliers in Deep Neural Networks. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14941. Springer, Cham. https://doi.org/10.1007/978-3-031-70341-6_9
DOI: https://doi.org/10.1007/978-3-031-70341-6_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70340-9
Online ISBN: 978-3-031-70341-6
eBook Packages: Computer Science, Computer Science (R0)