
A new speaker verification method with global speaker model and likelihood score normalization

Journal of Computer Science and Technology

Abstract

In this paper, a new text-independent speaker verification method, GSMSV, is proposed based on likelihood score normalization. In this method, a global speaker model is established to represent the universal features of speech and to normalize the likelihood score. Statistical analysis demonstrates that this normalization removes factors common to all speech and makes the differences between speakers more prominent. As a result, the equal error rate is decreased significantly, the verification procedure is accelerated, and the system's adaptability to speaking speed is improved.
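The normalization idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model form, the exact score formula, or the decision threshold used by GSMSV, so everything below is an assumption made for illustration: both models are taken to be diagonal-covariance Gaussian mixtures stored as lists of hypothetical `(weight, mean, var)` tuples, the normalized score is taken to be the difference of average per-frame log-likelihoods, and the threshold of `0.0` is arbitrary.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of frames under a mixture model.

    `model` is a list of (weight, mean, var) components; this layout is an
    assumption for the sketch, not something taken from the paper.
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in model]
        peak = max(terms)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)


def normalized_score(frames, speaker_model, global_model):
    """Claimed-speaker log-likelihood minus global-model log-likelihood."""
    return (avg_log_likelihood(frames, speaker_model)
            - avg_log_likelihood(frames, global_model))


def verify(frames, speaker_model, global_model, threshold=0.0):
    """Accept the identity claim if the normalized score exceeds the threshold.

    The default threshold is an arbitrary placeholder.
    """
    return normalized_score(frames, speaker_model, global_model) > threshold


# Toy one-dimensional data: the claimed speaker sits near +1, while the
# global model covers speech near both -1 and +1.
speaker_model = [(1.0, [1.0], [1.0])]
global_model = [(0.5, [-1.0], [1.0]), (0.5, [1.0], [1.0])]
frames_near = [[1.0], [1.2], [0.8]]     # resembles the claimed speaker
frames_far = [[-1.0], [-1.2], [-0.8]]   # resembles some other speaker

print(verify(frames_near, speaker_model, global_model))
print(verify(frames_far, speaker_model, global_model))
```

Subtracting the global model's log-likelihood cancels whatever both models explain equally well, so the remaining score reflects only how much better the claimed speaker's model fits than a generic model of speech. In a real system the threshold would be tuned on held-out data, for example at the equal error rate operating point.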



Additional information

Supported by the National Natural Science Foundation of China.

ZHANG Yiying received her M.S. degree from the Department of Computer Science, Jilin University in 1995. She is now a Ph.D. candidate of the Department of Computer Science and Technology, Tsinghua University. Her current research interests are speaker recognition and speech recognition.

ZHU Xiaoyan received her Ph.D. degree from Nagoya Institute of Technology, Japan in 1990. She is now an Associate Professor of the Department of Computer Science and Technology, Tsinghua University. Her current research interests are artificial neural networks, pattern recognition and speech processing.

ZHANG Bo graduated from the Department of Automation, Tsinghua University in 1958. He is now a Professor and Ph.D. candidate adviser of the Department of Computer Science and Technology, Tsinghua University, and a member of the Chinese Academy of Sciences. His current research interests are artificial intelligence and computer applications.


About this article

Cite this article

Zhang, Y., Zhu, X. & Zhang, B. A new speaker verification method with global speaker model and likelihood score normalization. J. Comput. Sci. & Technol. 15, 184–193 (2000). https://doi.org/10.1007/BF02948803
