Text-dependent speaker identification based on input/output HMMs: An empirical study

Chen, Ke; Xie, Dahong; Chi, Huisheng

doi:10.1007/BF00571681

Published: June 1996

Neural Processing Letters, Volume 3, pages 81–89 (1996)

Ke Chen^1,2,
Dahong Xie^1,2 &
Huisheng Chi^1,2

Abstract

In this paper, we explore the Input/Output HMM (IOHMM) architecture for a substantial problem, that of text-dependent speaker identification. For subnetworks modeled with generalized linear models, we extend the IRLS algorithm to the M-step of the corresponding EM algorithm. Experimental results show that the improved EM algorithm yields significantly faster training than the original one. In comparison with the multilayer perceptron, the dynamic programming technique and hidden Markov models, we empirically demonstrate that the IOHMM architecture is a promising approach to text-dependent speaker identification.
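
The abstract does not spell out the IRLS update, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a two-class logistic GLM and treats the per-sample weights `h` as stand-ins for the posteriors an E-step would supply; the function names `solve` and `irls_logistic`, the small `ridge` term, and the stopping tolerance are all hypothetical choices made for illustration.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def irls_logistic(X, y, h, n_iter=25, ridge=1e-8):
    """Fit a weighted logistic GLM by IRLS (Newton steps).

    X: list of feature rows, y: targets in [0, 1],
    h: non-negative per-sample weights (assumed E-step posteriors).
    Returns the coefficient vector w.
    """
    d = len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        # A accumulates X^T R X (plus a tiny ridge for numerical safety),
        # g accumulates the gradient of the weighted log-likelihood.
        A = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
        g = [0.0] * d
        for x, t, hi in zip(X, y, h):
            eta = sum(wj * xj for wj, xj in zip(w, x))
            mu = 1.0 / (1.0 + math.exp(-eta))  # logistic mean function
            r = hi * mu * (1.0 - mu)           # IRLS weight for this sample
            for i in range(d):
                g[i] += hi * (t - mu) * x[i]
                for j in range(d):
                    A[i][j] += r * x[i] * x[j]
        step = solve(A, g)
        w = [wj + sj for wj, sj in zip(w, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return w
```

Within an EM loop, a routine of this kind would be called once per subnetwork in each M-step, with `h` refreshed by the preceding E-step. Each Newton step uses curvature information, which is one plausible reason an IRLS-based M-step can need fewer passes than plain gradient ascent.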

Author information

Authors and Affiliations

National Laboratory of Machine Perception, Peking University, 100871, Beijing, China
Ke Chen, Dahong Xie & Huisheng Chi
Center for Information Science, Peking University, 100871, Beijing, China
Ke Chen, Dahong Xie & Huisheng Chi

About this article

Cite this article

Chen, K., Xie, D. & Chi, H. Text-dependent speaker identification based on input/output HMMs: An empirical study. Neural Process Lett 3, 81–89 (1996). https://doi.org/10.1007/BF00571681

Issue Date: June 1996
DOI: https://doi.org/10.1007/BF00571681
