
An Experimental Study on Replay Attack Detection Using Spoofing Clues from both Voiced and Non-Voiced Segments

Authors: Yifeng Wang, Yong Liu, Peng Gao, Yujun Wang
Affiliation (all authors): Xiaomi Inc., Beijing, China

ICDSP '21: Proceedings of the 2021 5th International Conference on Digital Signal Processing, February 2021, Pages 266–271. https://doi.org/10.1145/3458380.3458426

Published: 23 September 2021

ABSTRACT

Spoofing clues from reverberation, channel and environmental noise are intertwined with the genuine speaker's voice, which makes replay attack detection challenging. In this study, we propose a novel approach that makes full use of the replay clues in a whole utterance by extracting different features from voiced and non-voiced segments and training separate Gaussian Mixture Models. First, a joint voice activity detector is adopted to obtain accurate boundaries of the different segments. Then, Constant-Q Cepstral Coefficients (CQCC) and Inverse Mel Frequency Cepstral Coefficients (IMFCC) are extracted from voiced and non-voiced segments, respectively. Finally, a Score Calibrator Toolkit is used to fuse the scores of the voiced and non-voiced segments. Results on the evaluation set of the ASVspoof 2017 V2.0 corpus show that the proposed method yields an 18.4% relative reduction in equal error rate compared to the CQCC-CMVN baseline system.
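
The last two steps of the abstract (score fusion and the equal error rate comparison) can be illustrated with a minimal sketch. It is not the authors' implementation: the fusion here is a plain weighted sum with hand-picked, untrained weights standing in for the calibration toolkit, the EER is a simple threshold-sweep approximation, and all scores and function names (`fuse_scores`, `equal_error_rate`, `relative_reduction`) are hypothetical toy values chosen for illustration.

```python
from typing import List, Sequence


def fuse_scores(voiced: Sequence[float], non_voiced: Sequence[float],
                w_voiced: float = 0.5, w_non_voiced: float = 0.5,
                bias: float = 0.0) -> List[float]:
    """Linear score-level fusion of two per-utterance score streams.

    The weights are assumptions; a calibration toolkit would learn them.
    """
    if len(voiced) != len(non_voiced):
        raise ValueError("score lists must have the same length")
    return [w_voiced * v + w_non_voiced * n + bias
            for v, n in zip(voiced, non_voiced)]


def equal_error_rate(genuine: Sequence[float], spoof: Sequence[float]) -> float:
    """Approximate EER by sweeping thresholds over all observed scores.

    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    thresholds = sorted(set(genuine) | set(spoof))
    best_gap, best_eer = float("inf"), 1.0
    for t in thresholds:
        # A trial is accepted as genuine when its score is >= t.
        far = sum(s >= t for s in spoof) / len(spoof)      # spoof accepted
        frr = sum(g < t for g in genuine) / len(genuine)   # genuine rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer


def relative_reduction(baseline_eer: float, system_eer: float) -> float:
    """Relative EER reduction in percent: 100 * (baseline - system) / baseline."""
    return 100.0 * (baseline_eer - system_eer) / baseline_eer


# Toy log-likelihood-ratio scores (hypothetical, higher means "more genuine").
genuine_voiced = [2.0, 1.5, 0.2, 1.0]
spoof_voiced = [-1.0, 0.5, -0.5, -2.0]
genuine_non_voiced = [0.5, -0.2, 1.2, 0.8]
spoof_non_voiced = [-0.8, -1.5, 0.3, -1.0]

eer_voiced = equal_error_rate(genuine_voiced, spoof_voiced)
eer_fused = equal_error_rate(
    fuse_scores(genuine_voiced, genuine_non_voiced),
    fuse_scores(spoof_voiced, spoof_non_voiced),
)
print(f"voiced-only EER: {eer_voiced:.3f}, fused EER: {eer_fused:.3f}")
```

A relative reduction is measured against the baseline's own error rate, not in absolute percentage points. For example, with made-up numbers that are not taken from the paper, `relative_reduction(10.0, 8.16)` evaluates to roughly 18.4.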

Published in

ICDSP '21: Proceedings of the 2021 5th International Conference on Digital Signal Processing
February 2021
336 pages
ISBN:9781450389365
DOI:10.1145/3458380

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
replay attack
score calibration
speaker verification
spoofing detection
Qualifiers
- research-article
- Research
- Refereed limited