Feature and Signal Enhancement for Robust Speaker Identification of G.729 Decoded Speech

  • Conference paper
Neural Information Processing (ICONIP 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7667)

Abstract

For wireless remote access security, there is an emerging need for biometric speaker identification (SID) systems that are robust to speech coding distortion. This paper presents results on a Gaussian mixture model (GMM) based SID system that is trained on clean speech and tested on speech decoded by the G.729 codec. To mitigate the performance loss caused by mismatched training and testing conditions, five robust features, two enhancement approaches and three fusion strategies are used. The first enhancement method is feature compensation based on the affine transform. The second is the McCree signal enhancement approach, which uses the spectral envelope information in the G.729 bit stream. Ensemble systems using decision level fusion, score fusion and Borda count are studied. The best performance is obtained by combining signal enhancement, feature compensation and decision level fusion, which gives an identification success rate (ISR) of 89.8%.
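
The three fusion strategies named in the abstract can be illustrated with a small sketch. The preview does not show the paper's exact fusion rules, so everything below is an assumption made for illustration: decision level fusion is taken to be a majority vote over each system's top choice, score fusion is taken to be an unweighted sum of per-speaker scores, and the function names and the example score table are hypothetical.

```python
from collections import Counter

# Each inner list holds one system's scores for every enrolled speaker
# (for example, GMM log-likelihoods); a higher score means a better match.
# The numbers are made up for illustration and are not from the paper.
scores = [
    [-10.0, -12.0, -30.0],  # system A (one feature type)
    [-11.0, -12.0, -25.0],  # system B
    [-40.0, -12.5, -45.0],  # system C, with a very low score for speaker 0
]

def top_speaker(system_scores):
    """Index of the highest-scoring speaker for a single system."""
    return max(range(len(system_scores)), key=system_scores.__getitem__)

def decision_fusion(scores_per_system):
    """Assumed rule: majority vote over each system's top choice."""
    votes = Counter(top_speaker(s) for s in scores_per_system)
    most = max(votes.values())
    # Break ties in favour of the lowest speaker index.
    return min(spk for spk, count in votes.items() if count == most)

def score_fusion(scores_per_system):
    """Assumed rule: sum the scores across systems, pick the largest total."""
    n_speakers = len(scores_per_system[0])
    totals = [sum(s[i] for s in scores_per_system) for i in range(n_speakers)]
    return top_speaker(totals)

def borda_count(scores_per_system):
    """Each system awards 0 points to its worst speaker, up to N-1 to its best."""
    n_speakers = len(scores_per_system[0])
    points = [0] * n_speakers
    for s in scores_per_system:
        worst_to_best = sorted(range(n_speakers), key=s.__getitem__)
        for rank, spk in enumerate(worst_to_best):
            points[spk] += rank
    return top_speaker(points)

print("decision level:", decision_fusion(scores))
print("score fusion:  ", score_fusion(scores))
print("Borda count:   ", borda_count(scores))
```

With this toy table, the vote and the Borda count both select speaker 0, while the summed scores select speaker 1 because a single very low score from system C dominates the total. That sensitivity to one outlying system is a general property of unweighted score sums and is not a result reported by the paper.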

References

  1. Jain, A.K., Ross, A., Nandakumar, K.: Introduction to Biometrics. Springer (2011)

  2. Togneri, R., Pullella, D.: An overview of speaker identification: Accuracy and robustness issues. IEEE Circuits and Systems Magazine, 23–61 (2011)

  3. Fazel, A., Chakrabartty, S.: An overview of statistical pattern recognition techniques for speaker verification. IEEE Circuits and Systems Magazine, 62–81 (2011)

  4. Campbell, J.P., Shen, W., Campbell, W.M., Schwartz, R., Bonastre, J.-F., Matrouf, D.: Forensic speaker recognition. IEEE Signal Proc. Mag., 95–103 (2009)

  5. Mammone, R.J., Zhang, X., Ramachandran, R.P.: Robust speaker recognition - A feature based approach. IEEE Signal Proc. Mag., 58–71 (1996)

  6. ITU-T: Recommendation G.729 - coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction, CS-ACELP (2007)

  7. Moreno-Daniel, A., Juang, B.-H., Nolazco-Flores, J.A.: Robustness of bit-stream based features for speaker verification. In: IEEE Int. Conf. on Acoustics, Speech and Signal Proc., pp. I-749–I-752 (2005)

  8. McCree, A.: Reducing Speech Coding Distortion for Speaker Identification. In: IEEE Int. Conf. on Spoken Language Proc. (2006)

  9. Zilovic, M.S., Ramachandran, R.P., Mammone, R.J.: Speaker identification based on the use of robust cepstral features obtained from pole-zero transfer functions. IEEE Trans. on Speech and Audio Proc., 260–267 (1998)

  10. Polikar, R.: Ensemble based systems in decision making. IEEE Circuits and Systems Magazine, 21–45 (2006)

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Raval, K., Ramachandran, R.P., Shetty, S.S., Smolenski, B.Y. (2012). Feature and Signal Enhancement for Robust Speaker Identification of G.729 Decoded Speech. In: Huang, T., Zeng, Z., Li, C., Leung, C.S. (eds) Neural Information Processing. ICONIP 2012. Lecture Notes in Computer Science, vol 7667. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34500-5_41

  • DOI: https://doi.org/10.1007/978-3-642-34500-5_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34499-2

  • Online ISBN: 978-3-642-34500-5

  • eBook Packages: Computer Science, Computer Science (R0)