ABSTRACT
As a biometric technology, speaker recognition is widely used in finance, criminal investigation, and other fields due to its convenience and high accuracy. However, speaker recognition models are vulnerable to spoofing attacks and adversarial attacks, so the security of these models has received much attention. Few works, though, focus on decision-based adversarial attacks against speaker recognition systems (SRS), in which the adversary can only access the final decisions of the black-box model. In this paper, we propose Biased-Aha, a decision-based attack method that combines query-history information with a prior gradient from a substitution model to launch an efficient attack. Specifically, to generate the adversarial example, the perturbation direction is determined by following the sampling directions of successful queries, avoiding the sampling directions of failed queries, and incorporating the gradient direction from the substitution model. Experimental results show that Biased-Aha achieves a high attack success rate with high query efficiency. Against the Gaussian Mixture Model (GMM) and i-vector speaker recognition models, Biased-Aha outperforms state-of-the-art decision-based adversarial attacks.
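The sampling strategy described above can be sketched in a few lines. This is a simplified illustration of history-biased sampling with a gradient prior, not the authors' exact Biased-Aha algorithm: the mixing weights `w_hist`/`w_prior`, the update rule, and the toy decision function standing in for the SRS are all assumptions made for the sketch.

```python
import numpy as np

def biased_sample(dim, history, prior_grad, rng, w_hist=0.5, w_prior=0.5):
    """Draw a unit perturbation direction, biased toward the running history
    of successful directions and toward the substitution-model gradient."""
    direction = rng.standard_normal(dim)
    direction /= np.linalg.norm(direction)
    if history:
        hist = np.mean(history, axis=0)  # successful queries: +d, failed: -d
        n = np.linalg.norm(hist)
        if n > 1e-12:
            direction = (1 - w_hist) * direction + w_hist * hist / n
    if prior_grad is not None:
        g = prior_grad / np.linalg.norm(prior_grad)
        direction = (1 - w_prior) * direction + w_prior * g
    return direction / np.linalg.norm(direction)

def attack(x, x_adv_init, is_adversarial, prior_grad=None,
           steps=200, step_size=0.05, seed=0):
    """Starting from an already-adversarial point, shrink its distance to the
    original input x while only accepting candidates that stay adversarial."""
    rng = np.random.default_rng(seed)
    x_adv = x_adv_init.copy()
    history = []
    for _ in range(steps):
        d = biased_sample(x.size, history, prior_grad, rng)
        dist = np.linalg.norm(x - x_adv)
        # Contract toward the original input, then perturb along the biased direction.
        candidate = x_adv + step_size * (x - x_adv) + step_size * dist * d
        if is_adversarial(candidate):
            x_adv = candidate
            history.append(d)   # follow directions that succeeded
        else:
            history.append(-d)  # bias future samples away from failures
    return x_adv

# Toy black box standing in for the SRS decision: the point counts as
# "adversarial" once the linear score w @ z crosses a threshold.
dim = 16
w = np.ones(dim)
is_adv = lambda z: float(w @ z) > 4.0
x_orig = np.zeros(dim)       # original (benign) input
x_init = 2.0 * np.ones(dim)  # starting point, already "adversarial" in the toy model
result = attack(x_orig, x_init, is_adv, prior_grad=w)
print(is_adv(result))
```

Because candidates are accepted only when they remain adversarial, the returned point is always adversarial; the history and gradient bias serve to reduce the number of wasted queries compared with unbiased random sampling.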
Index Terms
- Decision-based adversarial attack for speaker recognition models