Abstract
Convolutional neural network (CNN) models are widely used for image classification, but they are vulnerable to out-of-distribution (OoD) samples. This vulnerability makes it difficult to deploy CNN models in safety-critical applications such as autonomous driving or medical diagnostics. OoD samples occur either naturally or in an adversarial setting, and detecting them is an active area of research. Papernot and McDaniel [43] proposed a detection method that applies a nearest-neighbor (NN) search to the layer activations of the CNN; the result of the search indicates whether a sample is in-distribution or OoD. However, an NN search is slow and memory-intensive at inference time. We examine a more efficient alternative detection approach based on clustering. We conducted experiments with CNN models trained on MNIST, SVHN, and CIFAR-10, testing our approach on naturally occurring OoD samples as well as on several kinds of adversarial examples, and comparing different clustering strategies. Our results show that a clustering-based approach is suitable for detecting OoD samples while being faster and more memory-efficient than an NN approach.
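To make the abstract's comparison concrete, the sketch below shows one plausible form of a clustering-based detector: a k-means model is fit per monitored layer on in-distribution activations, and a test sample is scored by its distance to the nearest centroid, averaged over layers. This is a minimal illustration, not the paper's actual procedure; the cluster count, the distance score, the percentile threshold, and the input format (`train_activations` as one array per layer) are all assumptions made here for the example.

```python
# A minimal sketch of clustering-based OoD detection on CNN layer
# activations. Assumptions (not taken from the paper): k-means clustering,
# a mean distance-to-nearest-centroid score, and a threshold calibrated
# on held-out in-distribution data.
import numpy as np
from sklearn.cluster import KMeans


def fit_layer_clusters(train_activations, n_clusters=10, seed=0):
    """Fit one k-means model per layer.

    train_activations: list of (n_samples, n_features) arrays,
    one array per monitored CNN layer (hypothetical input format).
    """
    return [KMeans(n_clusters=n_clusters, random_state=seed).fit(acts)
            for acts in train_activations]


def ood_score(models, sample_activations):
    """Mean distance to the nearest centroid across layers.

    Larger scores mean the sample lies farther from the training
    distribution in activation space.
    """
    dists = []
    for model, act in zip(models, sample_activations):
        # transform() returns distances to all centroids; keep the smallest.
        dists.append(model.transform(act.reshape(1, -1)).min())
    return float(np.mean(dists))


# Usage sketch: calibrate a threshold on held-out in-distribution scores,
# e.g. their 95th percentile, then flag samples that exceed it.
# models = fit_layer_clusters(train_acts)
# threshold = np.percentile([ood_score(models, a) for a in val_acts], 95)
# is_ood = ood_score(models, test_acts) > threshold
```

Under these assumptions, the efficiency claim is easy to see: an NN search must store all training activations and scan them at inference, whereas this detector keeps only k centroids per layer and computes k distances per layer per sample.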
References
Ackermann, M.R., Blömer, J., Kuntze, D., Sohler, C.: Analysis of agglomerative clustering. Algorithmica 69, 184–215 (2014)
Ankerst, M., Breunig, M.M., Kriegel, H.P., Sander, J.: OPTICS: ordering points to identify the clustering structure. In: Proceedings of SIGMOD, pp. 49–60. ACM, Philadelphia (1999)
Biggio, B., et al.: Evasion attacks against machine learning at test time. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8190, pp. 387–402. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40994-3_25
Blundell, C., Cornebise, J., Kavukcuoglu, K., Wierstra, D.: Weight uncertainty in neural networks. In: Bach, F., Blei, D. (eds.) ICML, vol. 37, pp. 1613–1622. PMLR, Lille (2015)
Chen, B., et al.: Detecting backdoor attacks on deep neural networks by activation clustering. In: Espinoza, H., hÉigeartaigh, S.Ó., Huang, X., Hernández-Orallo, J., Castillo-Effen, M. (eds.) Workshop on SafeAI@AAAI. CEUR Workshop, vol. 2301. ceur-ws.org, Honolulu (2019)
Chen, T., Navratil, J., Iyengar, V., Shanmugam, K.: Confidence scoring using whitebox meta-models with linear classifier probes. In: Chaudhuri, K., Sugiyama, M. (eds.) AISTATS, vol. 89, pp. 1467–1475. PMLR, Naha (2019)
Chou, E., Tramer, F., Pellegrino, G.: SentiNet: detecting localized universal attacks against deep learning systems. ArXiv https://arxiv.org/abs/1812.00292 (2020)
Clanuwat, T., Bober-Irizar, M., Kitamoto, A., Lamb, A., Yamamoto, K., Ha, D.: Deep learning for classical Japanese literature. ArXiv https://arxiv.org/abs/1812.01718 (2018)
Cohen, G., Sapiro, G., Giryes, R.: Detecting adversarial samples using influence functions and nearest neighbors. In: CVPR, pp. 14441–14450. IEEE, Seattle (2020)
Crecchi, F., Bacciu, D., Biggio, B.: Detecting adversarial examples through nonlinear dimensionality reduction. ArXiv https://arxiv.org/abs/1904.13094 (2019)
Croce, F., Hein, M.: Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In: ICML, vol. 119, pp. 2206–2216. PMLR (2020)
DeVries, T., Taylor, G.W.: Learning confidence for out-of-distribution detection in neural networks. ArXiv https://arxiv.org/abs/1802.04865 (2018)
Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD, pp. 226–231. AAAI Press, Portland (1996)
Gal, Y.: Uncertainty in deep learning. Ph.D. thesis, University of Cambridge (2016)
Gal, Y., Ghahramani, Z.: Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In: Balcan, M., Weinberger, K. (eds.) ICML, vol. 48, pp. 1050–1059. PMLR, New York (2016)
Goodfellow, I., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: Bengio, Y., LeCun, Y. (eds.) ICLR, San Diego, CA, USA (2015)
Grosse, K., Manoharan, P., Papernot, N., Backes, M., McDaniel, P.: On the (statistical) detection of adversarial examples. ArXiv https://arxiv.org/abs/1702.06280 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778. IEEE, Las Vegas (2016)
Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-of-distribution examples in neural networks. In: ICLR. Toulon, France (2017)
Hendrycks, D., Mazeika, M., Kadavath, S., Song, D.: Using self-supervised learning can improve model robustness and uncertainty. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) NeurIPS, vol. 32, pp. 15637–15648. CAI, Vancouver (2019)
Hendrycks, D., Zhao, K., Basart, S., Steinhardt, J., Song, D.: Natural adversarial examples. ArXiv https://arxiv.org/abs/1907.07174 (2020)
Huang, H., Li, Z., Wang, L., Chen, S., Dong, B., Zhou, X.: Feature space singularity for out-of-distribution detection. ArXiv https://arxiv.org/abs/2011.14654 (2020)
Kim, H.: Torchattacks: a PyTorch repository for adversarial attacks. ArXiv https://arxiv.org/abs/2010.01950 (2020)
Krizhevsky, A.: Learning multiple layers of features from tiny images. Tech. rep., University of Toronto (2009)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) NIPS, vol. 25, pp. 1097–1105. CAI, Lake Tahoe (2012)
Kurakin, A., Goodfellow, I.J., Bengio, S.: Adversarial examples in the physical world. In: ICLR. Toulon, France (2017)
LeCun, Y., Cortes, C., Burges, C.: MNIST handwritten digit database. AT&T Labs [Online]. http://yann.lecun.com/exdb/mnist (2010)
Lee, K., Lee, H., Lee, K., Shin, J.: Training confidence-calibrated classifiers for detecting out-of-distribution samples. In: ICLR. Vancouver, CA (2018)
Lee, K., Lee, K., Lee, H., Shin, J.: A simple unified framework for detecting out-of-distribution samples and adversarial attacks. ArXiv https://arxiv.org/abs/1807.03888 (2018)
Li, X., Li, F.: Adversarial examples detection in deep networks with convolutional filter statistics. In: ICCV, pp. 5775–5783. IEEE, Venice, Italy (2017)
Liang, S., Li, Y., Srikant, R.: Enhancing the reliability of out-of-distribution image detection in neural networks. In: ICLR. Vancouver, CA (2018)
Liu, W., Wang, X., Owens, J., Li, Y.: Energy-based out-of-distribution detection. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H. (eds.) NeurIPS, vol. 33, pp. 21464–21475. CAI (2020)
Ma, X., et al.: Characterizing adversarial subspaces using local intrinsic dimensionality. In: ICLR. Vancouver, CA (2018)
van der Maaten, L.J.P.: Learning a parametric embedding by preserving local structure. In: van Dyk, D., Welling, M. (eds.) AISTATS, vol. 5, pp. 384–391. PMLR, Clearwater Beach (2009)
MacQueen, J.B.: Some methods for classification and analysis of multivariate observations. In: Le Cam, L.M., Neyman, J. (eds.) Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press (1967)
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: ICLR. Vancouver, CA (2018)
McInnes, L., Healy, J., Melville, J.: UMAP: uniform manifold approximation and projection for dimension reduction. ArXiv https://arxiv.org/abs/1802.03426 (2018)
Meng, D., Chen, H.: MagNet: a two-pronged defense against adversarial examples. In: SIGSAC, pp. 135–147. ACM, Dallas (2017)
Metzen, J.H., Genewein, T., Fischer, V., Bischoff, B.: On detecting adversarial perturbations. In: ICLR. Toulon, France (2017)
Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning (2011)
Nguyen, A., Yosinski, J., Clune, J.: Multifaceted feature visualization: uncovering the different types of features learned by each neuron in deep neural networks. In: Visualization for Deep Learning Workshop, ICML (2016). ArXiv https://arxiv.org/abs/1602.03616
Pang, T., Du, C., Dong, Y., Zhu, J.: Towards robust detection of adversarial examples. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) NeurIPS, vol. 31, pp. 4584–4594. CAI, Montreal (2018)
Papernot, N., McDaniel, P.: Deep k-nearest neighbors: towards confident, interpretable and robust deep learning. ArXiv https://arxiv.org/abs/1803.04765 (2018)
Pearson, K.: LIII. On lines and planes of closest fit to systems of points in space. London Edinb. Dublin Philos. Mag. J. Sci. 2(11), 559–572 (1901)
Qin, Y., Frosst, N., Sabour, S., Raffel, C., Cottrell, G., Hinton, G.E.: Detecting and diagnosing adversarial images with class-conditional capsule reconstructions. In: ICLR. Addis Ababa, Ethiopia (2020)
Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20(1), 53–65 (1987)
Szegedy, C., et al.: Intriguing properties of neural networks. In: Bengio, Y., LeCun, Y. (eds.) ICLR. Banff, CA (2014)
Xu, W., Evans, D., Qi, Y.: Feature squeezing: detecting adversarial examples in deep neural networks. ArXiv https://arxiv.org/abs/1704.01155 (2017)
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53
Zhang, H., Dauphin, Y.N., Ma, T.: Fixup initialization: residual learning without normalization. ArXiv https://arxiv.org/abs/1901.09321 (2019)
Cite this paper
Lehmann, D., Ebner, M. (2021). Layer-Wise Activation Cluster Analysis of CNNs to Detect Out-of-Distribution Samples. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science(), vol 12894. Springer, Cham. https://doi.org/10.1007/978-3-030-86380-7_18