Replay attack detection based on distortion by loudspeaker for voice authentication

Multimedia Tools and Applications

Abstract

Identity authentication based on Automatic Speaker Verification (ASV) has attracted extensive attention, and voice can substitute for a password in many applications. However, the security of current ASV systems is seriously challenged by malicious spoofing attacks. Among these, the replay attack is one of the biggest threats: an adversary plays back a pre-recorded speech sample of the legitimate user to gain access to the ASV system. In this paper, we present a replay attack detection (RAD) scheme that distinguishes normal speech from replayed speech. We focus on two distortions introduced by the loudspeaker, low-frequency attenuation and high-frequency harmonics, and present a suite of RAD features, DL-RAD, comprising Harmonic Energy Ratio (HER), Low Spectral Ratio (LSR), Low Spectral Variance (LSV), and Low Spectral Difference Variance (LSDV), to characterize the differences between normal and replayed speech signals. An SVM is adopted as the classifier to evaluate these features. Experimental results show that the True Positive Rate (TPR) and True Negative Rate (TNR) of the proposed method are about 98.15% and 98.75%, respectively, significantly better than the existing scheme. The proposed scheme can be applied to both text-dependent and text-independent ASV systems.
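For readers unfamiliar with the two reported metrics, the following minimal Python sketch shows how TPR and TNR are computed from a classifier's decisions. It is an illustration only, not the authors' evaluation code: the function name `tpr_tnr`, the toy labels, and the convention that replayed speech is the positive class (label 1) are all assumptions, since the abstract does not state which class is treated as positive.

```python
def tpr_tnr(y_true, y_pred, positive=1):
    """Return (TPR, TNR) for binary labels.

    Assumption: `positive` marks replayed speech; any other label is
    treated as normal (genuine) speech.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    # TPR = TP / (TP + FN): fraction of positive samples detected.
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    # TNR = TN / (TN + FP): fraction of negative samples accepted.
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr


# Hypothetical toy data: 1 = replayed speech, 0 = normal speech.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
tpr, tnr = tpr_tnr(y_true, y_pred)
print(f"TPR = {tpr:.2f}, TNR = {tnr:.2f}")
```

Reporting both rates, rather than accuracy alone, shows separately how often replayed speech is caught and how often normal speech is wrongly rejected.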

Author information

Corresponding author

Correspondence to Yanzhen Ren.

Additional information

This work is supported by the Natural Science Foundation of China (NSFC) under grants No. U1536114, No. 61872275, and No. U1536204, and by the China Scholarship Council.

Cite this article

Ren, Y., Fang, Z., Liu, D. et al. Replay attack detection based on distortion by loudspeaker for voice authentication. Multimed Tools Appl 78, 8383–8396 (2019). https://doi.org/10.1007/s11042-018-6834-3

  • DOI: https://doi.org/10.1007/s11042-018-6834-3
