ABSTRACT
Threatened by identity leakage during voice data publishing, users face a privacy-utility dilemma when enjoying convenient voice services. Existing studies de-identify users' voices through direct modification or text-based re-synthesis, but these approaches yield inconsistent audibility for human listeners and do not adapt to informed attacks. In this poster, we propose a non-intrusive and adaptive speaker de-identification scheme that balances the privacy and utility of voice services. We generate adversarial examples to conceal user identity from Automatic Speaker Identification (ASI) systems. By learning a compact distribution with a conditional variational auto-encoder, our system enables on-demand target sampling and diverse identity transformation. We also exploit the acoustic masking effect to construct inaudible perturbations, thus preserving speech content and perceptual quality. Experiments on 50 speakers show that our system achieves 98.2% successful de-identification on 4 mainstream ASIs, with an objective perceptual quality of 4.38 and a subjective mean opinion score of 4.56.
Index Terms
- A non-intrusive and adaptive speaker de-identification scheme using adversarial examples