Abstract
Recent work on automatic speech recognition (ASR) has shown that the underlying neural networks are vulnerable to so-called adversarial examples. To counter these attacks, various defense mechanisms have been proposed. Most defenses discussed so far rely on supervised learning and therefore require adversarial examples to be generated and labeled for training, which is resource-intensive. In this work, we present and compare several unsupervised learning methods for detecting audio adversarial examples, including the autoencoder, the variational autoencoder (VAE), the one-class support vector machine (OCSVM), and the isolation forest, none of which require adversarial examples in the training data. Our experimental results show that some of the considered methods, e.g., the isolation forest, successfully defend against a simple adversarial attack. Even in a more elaborate attack scenario that exploits human psychoacoustics, we still achieve a high detection rate, e.g., with an autoencoder, at the cost of a slightly increased false positive rate. We expect our detailed analysis to serve as a helpful baseline for further research on defenses against audio adversarial examples.
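To make the detection setting concrete, below is a minimal sketch of one of the listed methods: an isolation forest fitted only on benign speech. The MFCC summary features, scikit-learn usage, hyperparameters, and file paths are illustrative assumptions rather than the paper's exact configuration; an autoencoder or OCSVM detector would expose the same train-on-benign, flag-outliers interface.

```python
# Minimal sketch of unsupervised adversarial-audio detection.
# Assumptions (not the paper's exact setup): per-clip MFCC mean/std
# features, scikit-learn's IsolationForest, hypothetical file paths.
import numpy as np
import librosa
from sklearn.ensemble import IsolationForest

def mfcc_features(wav_path, sr=16000, n_mfcc=20):
    """Load a clip and summarize it as mean/std of its MFCCs (one vector per clip)."""
    audio, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Train only on benign speech -- no adversarial examples are needed.
benign_paths = ["benign_0001.wav", "benign_0002.wav"]  # hypothetical clean clips
X_train = np.stack([mfcc_features(p) for p in benign_paths])

detector = IsolationForest(n_estimators=100, contamination="auto", random_state=0)
detector.fit(X_train)

def is_adversarial(wav_path):
    """Flag a clip as adversarial if the forest scores it as an outlier (-1)."""
    x = mfcc_features(wav_path).reshape(1, -1)
    return detector.predict(x)[0] == -1
```

The property shared by all four compared methods is that training sees only clean data, so no specific attack has to be anticipated when the detector is built.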
Acknowledgment
This research was supported by the Bavarian Ministry of Economic Affairs, Regional Development and Energy.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Choosaksakunwiboon, S., Pizzi, K., Kao, C.Y. (2022). Comparing Unsupervised Detection Algorithms for Audio Adversarial Examples. In: Prasanna, S.R.M., Karpov, A., Samudravijaya, K., Agrawal, S.S. (eds.) Speech and Computer. SPECOM 2022. Lecture Notes in Computer Science, vol. 13721. Springer, Cham. https://doi.org/10.1007/978-3-031-20980-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20979-6
Online ISBN: 978-3-031-20980-2