Model-Agnostic Adversarial Example Detection Through Logit Distribution Learning


Abstract:

Recent research on vision-based tasks has achieved great improvement due to the development of deep learning solutions. However, deep models have been found vulnerable to adversarial attacks, where the original inputs are maliciously manipulated to cause dramatic shifts in the outputs. In this paper, we focus on adversarial attacks against image classifiers built with deep neural networks and propose a model-agnostic approach to detecting adversarial inputs. We argue that the logit semantics of adversarial inputs evolve differently from those of original inputs, and we construct a logits-based feature embedding for effective representation learning. We then train an LSTM network on the sequence of logits-based features to detect adversarial examples. Experimental results on the MNIST, CIFAR-10, and CIFAR-100 datasets show that our method achieves state-of-the-art accuracy in detecting adversarial examples and generalizes well.
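The detection pipeline the abstract describes — a sequence of logit-based feature vectors fed to an LSTM whose final state yields a clean-vs-adversarial score — can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the paper's actual embedding construction, LSTM architecture, and training procedure are not given here, and names such as `LSTMDetector`, `feat_dim`, and `hidden_dim` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMDetector:
    """Toy binary detector over a sequence of logit-based features.

    Hypothetical sketch: illustrates the idea of running an LSTM over
    logit features and reading an adversarial score from the final
    hidden state. Weights here are random; a real detector would be
    trained on clean and adversarial examples.
    """
    def __init__(self, feat_dim, hidden_dim):
        d, h = feat_dim, hidden_dim
        # One weight matrix and bias per gate: input, forget, cell, output.
        self.W = rng.standard_normal((4, h, d + h)) * 0.1
        self.b = np.zeros((4, h))
        self.w_out = rng.standard_normal(h) * 0.1  # linear read-out

    def forward(self, seq):
        """seq: (T, feat_dim) logit-feature sequence; returns a score in (0, 1)."""
        h = np.zeros(self.W.shape[1])
        c = np.zeros_like(h)
        for x in seq:
            z = np.concatenate([x, h])
            i = sigmoid(self.W[0] @ z + self.b[0])   # input gate
            f = sigmoid(self.W[1] @ z + self.b[1])   # forget gate
            g = np.tanh(self.W[2] @ z + self.b[2])   # candidate cell state
            o = sigmoid(self.W[3] @ z + self.b[3])   # output gate
            c = f * c + i * g
            h = o * np.tanh(c)
        return float(sigmoid(self.w_out @ h))

# Example: a sequence of 5 logit vectors from a 10-class classifier.
det = LSTMDetector(feat_dim=10, hidden_dim=16)
score = det.forward(rng.standard_normal((5, 10)))
print(score)  # a probability-like score in (0, 1)
```

A trained version of this detector would threshold `score` to flag inputs as adversarial, independently of the classifier that produced the logits — hence "model-agnostic".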
Date of Conference: 19-22 September 2021
Date Added to IEEE Xplore: 23 August 2021
Conference Location: Anchorage, AK, USA
