Attention Network with GMM Based Feature for ASV Spoofing Detection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12878)

Abstract

Automatic Speaker Verification (ASV) is widely used for its convenience, but it is vulnerable to spoofing attacks. A 2-class Gaussian Mixture Model (GMM) classifier for genuine and spoofed speech is usually used as the baseline in the ASVspoof challenge. The GMM accumulates the scores of all frames in an utterance independently and does not consider their context. We propose a self-attention network for spoofing detection whose input is the log-probabilities of the speech frames on the GMM components. The model relies on the self-attention mechanism, which directly captures global dependencies among the inputs. It thus considers not only the score distribution over the GMM components but also the relationships between frames. A pooling layer is used to capture long-term characteristics for detection. We also propose a two-path attention network, which is based on two GMMs trained on genuine and spoofed speech respectively. Experiments on the logical and physical access scenarios of the ASVspoof 2019 challenge show that the proposed models greatly improve performance compared with the baseline systems. In our experiments, LFCC features are more suitable for our models than CQCC features.
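The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the network depth, number of heads, pooling type, or whether mixture weights are folded into the per-component log-probabilities, so the diagonal-covariance GMM, the single attention head, the mean pooling, and all function names below are assumptions made for illustration only.

```python
import numpy as np


def gmm_component_log_probs(frames, weights, means, variances):
    """Per-frame, per-component log(w_k * N(x_t | mu_k, diag(var_k))).

    frames: (T, D); weights: (K,); means, variances: (K, D).
    Including log(w_k) is an assumption; the abstract does not say.
    """
    diff = frames[:, None, :] - means[None, :, :]                    # (T, K, D)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)    # (K,)
    log_exp = -0.5 * (diff ** 2 / variances[None, :, :]).sum(axis=2) # (T, K)
    return np.log(weights)[None, :] + log_norm[None, :] + log_exp


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)


def baseline_llr(frames, gmm_genuine, gmm_spoof):
    """2-class GMM baseline: mean frame-level log-likelihood ratio.

    Each frame contributes independently, so frame context is ignored.
    """
    ll_g = logsumexp(gmm_component_log_probs(frames, *gmm_genuine), axis=1)
    ll_s = logsumexp(gmm_component_log_probs(frames, *gmm_spoof), axis=1)
    return float(np.mean(ll_g - ll_s))


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over the frame axis."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])           # (T, T) frame-to-frame
    scores -= scores.max(axis=1, keepdims=True)      # stabilise the softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ v, attn


def pooled_embedding(features, w_q, w_k, w_v):
    """Attention followed by mean pooling over time (pooling type assumed)."""
    out, _ = self_attention(features, w_q, w_k, w_v)
    return out.mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, D, K, H = 50, 20, 8, 16                       # hypothetical sizes
    frames = rng.normal(size=(T, D))                 # stand-in for LFCC frames
    gmm = (np.full(K, 1.0 / K), rng.normal(size=(K, D)), np.ones((K, D)))
    feats = gmm_component_log_probs(frames, *gmm)    # (T, K) network input
    w_q, w_k, w_v = (rng.normal(size=(K, H)) * 0.1 for _ in range(3))
    print(pooled_embedding(feats, w_q, w_k, w_v).shape)
```

Unlike the baseline score, the attention weights form a T-by-T matrix, so each output frame depends on every other frame before pooling collapses the sequence into an utterance-level vector. For the two-path variant, one plausible reading is to compute `feats` once per GMM (genuine and spoofed) and run each through its own attention path; how the paths are combined is not stated in the abstract.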

Acknowledgments

This work is supported by the National Natural Science Foundation of P.R. China (62067004) and by the Educational Commission of Jiangxi Province of P.R. China (GJJ170205).

Author information

Corresponding author

Correspondence to Zhenchun Lei.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Lei, Z., Yu, H., Yang, Y., Ma, M. (2021). Attention Network with GMM Based Feature for ASV Spoofing Detection. In: Feng, J., Zhang, J., Liu, M., Fang, Y. (eds) Biometric Recognition. CCBR 2021. Lecture Notes in Computer Science, vol 12878. Springer, Cham. https://doi.org/10.1007/978-3-030-86608-2_50

  • DOI: https://doi.org/10.1007/978-3-030-86608-2_50

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86607-5

  • Online ISBN: 978-3-030-86608-2

  • eBook Packages: Computer Science, Computer Science (R0)
