Exploiting Glottal Information in Speaker Recognition Using Parallel GMMs

Yang, Pu; Yang, Yingchun; Wu, Zhaohui

doi:10.1007/11527923_84

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3546)

Included in the following conference series:

International Conference on Audio- and Video-Based Biometric Person Authentication

Abstract

Information about the vocal tract and information about the glottis are two sources that can characterize speakers. Although the former has achieved quite good performance in automatic speaker recognition (ASR) tasks, glottal information performs poorly when used on its own. This work explores how to combine vocal tract and glottal information efficiently and effectively. Taking into account the short-term correlation between the two, we first propose an improved joint probability function model of the corresponding features. We then present a novel integrated system that uses parallel Gaussian Mixture Models (GMMs) based on this function. Combined with the traditional GMM, it also forms a hybrid model. Both methods were applied to the YOHO and SRMC corpora, and the experiments show promising results.
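
The abstract does not give the joint probability function or the exact parallel-GMM architecture, so the following is only a minimal sketch of the general idea of scoring two feature streams with separate GMMs and combining the results. It assumes diagonal-covariance GMMs, a hypothetical pitch-like scalar as the glottal feature, and a hypothetical fusion weight `alpha`. It treats the two streams as independent, which is a simplification: the paper's model is said to account for their short-term correlation, and this sketch does not.

```python
import numpy as np

def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D), weights: (M,), means: (M, D), variances: (M, D)
    """
    frames = np.atleast_2d(frames)
    diff = frames[:, None, :] - means[None, :, :]            # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_weighted = log_comp + np.log(weights)                # (T, M)
    # Log-sum-exp over mixture components, kept numerically stable.
    peak = log_weighted.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_weighted - peak)
                          .sum(axis=1, keepdims=True))).ravel()

def fused_score(tract_frames, glottal_frames, tract_gmm, glottal_gmm, alpha=0.8):
    """Weighted sum of average log-likelihoods from the two streams.

    alpha is a hypothetical fusion weight, not a value from the paper.
    """
    s_tract = gmm_frame_loglik(tract_frames, *tract_gmm).mean()
    s_glottal = gmm_frame_loglik(glottal_frames, *glottal_gmm).mean()
    return alpha * s_tract + (1.0 - alpha) * s_glottal

def identify(tract_frames, glottal_frames, speakers, alpha=0.8):
    """Return the enrolled speaker whose pair of GMMs scores highest."""
    scores = {
        name: fused_score(tract_frames, glottal_frames, tract_gmm, glottal_gmm, alpha)
        for name, (tract_gmm, glottal_gmm) in speakers.items()
    }
    return max(scores, key=scores.get)

# Toy enrollment: each speaker has (vocal-tract GMM, glottal GMM),
# each GMM given as (weights, means, variances). All values are made up.
speakers = {
    "A": ((np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[1.0, 1.0]])),
          (np.array([1.0]), np.array([[120.0]]), np.array([[100.0]]))),
    "B": ((np.array([1.0]), np.array([[3.0, 3.0]]), np.array([[1.0, 1.0]])),
          (np.array([1.0]), np.array([[220.0]]), np.array([[400.0]]))),
}
test_tract = np.array([[0.1, -0.2], [0.0, 0.3]])
test_glottal = np.array([[118.0], [125.0]])
best = identify(test_tract, test_glottal, speakers)
```

With these toy values the test frames lie close to speaker A's component means in both streams, so `best` is `"A"`. In practice the GMM parameters would be estimated from enrollment data and `alpha` tuned on held-out data.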

Author information

Authors and Affiliations

College of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, P.R. China
Pu Yang, Yingchun Yang & Zhaohui Wu

Editor information

Editors and Affiliations

The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, 15213-3890, USA
Takeo Kanade
Withington Hospital, Nightingale Centre, Manchester, UK
Anil Jain
IBM Thomas J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10598, USA
Nalini K. Ratha

About this paper

Cite this paper

Yang, P., Yang, Y., Wu, Z. (2005). Exploiting Glottal Information in Speaker Recognition Using Parallel GMMs. In: Kanade, T., Jain, A., Ratha, N.K. (eds) Audio- and Video-Based Biometric Person Authentication. AVBPA 2005. Lecture Notes in Computer Science, vol 3546. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527923_84

DOI: https://doi.org/10.1007/11527923_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27887-0
Online ISBN: 978-3-540-31638-1
eBook Packages: Computer Science, Computer Science (R0)
