DOI: 10.1145/3577530.3577574
research-article

Decision-based adversarial attack for speaker recognition models

Published: 30 March 2023

ABSTRACT

As a biometric technology, speaker recognition is widely used in finance, criminal investigation, and other fields because of its convenience and high accuracy. Speaker recognition models are nevertheless vulnerable to spoofing attacks and adversarial attacks, so their security has received much attention. Few works, however, focus on decision-based adversarial attacks against speaker recognition systems (SRS), in which the adversary can access only the final decisions of the black-box model. In this paper, we propose Biased-Aha, a decision-based attack method that combines query history information with a prior gradient from a substitute model to launch an efficient attack. Specifically, to generate an adversarial example, the perturbation follows the sampling directions of successful queries, avoids the sampling directions of failed queries, and incorporates the gradient direction from the substitute model. Experimental results show that Biased-Aha achieves a high attack success rate and high efficiency. On speaker recognition models based on Gaussian Mixture Models (GMM) and i-vectors, Biased-Aha outperforms state-of-the-art decision-based adversarial attacks.
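The sampling idea described in the abstract can be illustrated with a small toy sketch. This is not the authors' Biased-Aha implementation: the function names (`history_biased_attack`, `prior_direction`, `is_adversarial`), the update rule for the history vector, and all weights are assumptions chosen for illustration, and plain float vectors stand in for audio. The sketch only shows how a candidate direction might mix random noise, a signed memory of past queries, and a direction supplied by a substitute model, while using nothing but boolean decisions from the black box.

```python
import math
import random


def unit(v):
    """Scale v to unit L2 norm (returns a copy if v is all zeros)."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v] if n > 0 else list(v)


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def history_biased_attack(x_orig, x_start, is_adversarial, prior_direction,
                          steps=300, step_size=0.05, history_weight=0.5,
                          prior_weight=0.5, decay=0.9, seed=0):
    """Toy decision-based attack (illustrative assumption, not the paper's
    algorithm): move an adversarial point closer to x_orig using only the
    boolean output of is_adversarial."""
    rng = random.Random(seed)
    dim = len(x_orig)
    x_adv = list(x_start)      # assumed to be adversarial already
    history = [0.0] * dim      # signed memory of past sampling directions
    queries = 0
    for _ in range(steps):
        dist = l2(x_adv, x_orig)
        noise = unit([rng.gauss(0.0, 1.0) for _ in range(dim)])
        prior = unit(prior_direction(x_adv))   # hint from the substitute model
        toward = unit([o - a for o, a in zip(x_orig, x_adv)])
        # Mix random sampling, query history, substitute prior, and a pull
        # toward the original sample into one unit-length direction.
        direction = unit([n + history_weight * h + prior_weight * p + t
                          for n, h, p, t in zip(noise, history, prior, toward)])
        candidate = [a + step_size * dist * d
                     for a, d in zip(x_adv, direction)]
        queries += 1           # one decision query per candidate
        success = is_adversarial(candidate) and l2(candidate, x_orig) < dist
        # Follow directions that succeeded, steer away from ones that failed.
        sign = 1.0 if success else -1.0
        history = [decay * h + sign * (1.0 - decay) * d
                   for h, d in zip(history, direction)]
        if success:
            x_adv = candidate
    return x_adv, queries


if __name__ == "__main__":
    # Hypothetical black box: a linear decision rule on 8-dimensional input.
    oracle = lambda x: sum(x) > 1.0
    # Hypothetical substitute model that roughly knows the decision normal.
    substitute = lambda x: [1.0] * len(x)
    x_orig, x_start = [0.0] * 8, [2.0] * 8
    x_adv, used = history_biased_attack(x_orig, x_start, oracle, substitute)
    print("queries:", used)
    print("start distance:", l2(x_start, x_orig))
    print("final distance:", l2(x_adv, x_orig))
```

In this sketch a candidate is accepted only if the black box still labels it adversarial and it lies closer to the original sample, so the distance to the original never increases. The history vector is an exponential moving average with a negative sign for failed queries, which is one simple way to encode "follow successes, avoid failures".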

Published in

CSAI '22: Proceedings of the 2022 6th International Conference on Computer Science and Artificial Intelligence
December 2022
341 pages
ISBN: 9781450397773
DOI: 10.1145/3577530

Copyright © 2022 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers: research-article, Research, Refereed limited