
Prototype-based minimum error training for speech recognition

  • Published in: Applied Intelligence

Abstract

A key concept in pattern recognition is that a pattern recognizer should be designed so as to minimize the errors it makes in classifying patterns. In this article, we review a recent, promising approach for minimizing the error rate of a classifier and describe a particular application to a simple, prototype-based speech recognizer. The key idea is to define a smooth, differentiable loss function that incorporates all adaptable classifier parameters and that approximates the actual performance error rate. Gradient descent can then be used to minimize this loss. This approach allows but does not require the use of explicitly probabilistic models. Furthermore, minimum error training does not involve the estimation of probability distributions that are difficult to obtain reliably. This new method has been applied to a variety of pattern recognition problems, with good results. Here we describe a particular application in which a relatively simple distance-based classifier is trained to minimize errors in speech recognition tasks. The loss function is defined so as to reflect errors at the level of the final, grammar-driven recognition output. Thus, minimization of this loss directly optimizes the overall system performance.
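The training scheme the abstract describes can be sketched numerically. The following is a minimal illustration, not the article's implementation: a two-class, two-prototype classifier on toy Gaussian data, where the discriminant for each class is the negative squared distance to that class's prototype, a sigmoid of the misclassification measure serves as the smooth surrogate for the 0/1 error, and per-sample gradient descent (GPD-style) pulls the correct prototype toward each sample while pushing the best competing prototype away. All data, initial prototype positions, and parameter values (`alpha`, `lr`, epoch count) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated 2-D Gaussian classes (illustrative only).
X = np.vstack([rng.normal(-1.0, 0.5, (50, 2)),
               rng.normal(+1.0, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# One prototype per class, deliberately mis-placed so a noticeable
# fraction of samples start out misclassified.
protos = np.array([[0.5, 0.5], [1.5, 1.5]])

def discriminants(x):
    """g_k(x) = -squared Euclidean distance to the prototype of class k."""
    return -np.sum((protos - x) ** 2, axis=1)

def error_rate():
    preds = np.array([np.argmax(discriminants(x)) for x in X])
    return float(np.mean(preds != y))

def train(epochs=50, lr=0.1, alpha=1.0):
    for _ in range(epochs):
        for x, c in zip(X, y):
            g = discriminants(x)
            g_others = g.copy()
            g_others[c] = -np.inf
            j = int(np.argmax(g_others))            # best competing class
            d = -g[c] + g[j]                        # misclassification measure
            ell = 1.0 / (1.0 + np.exp(-alpha * d))  # smooth 0/1-loss surrogate
            scale = alpha * ell * (1.0 - ell)       # sigmoid slope at d
            # Chain rule through g_k = -||p_k - x||^2: gradient descent on
            # ell pulls the correct prototype toward x and pushes the
            # competing prototype away from x.
            protos[c] -= lr * scale * 2.0 * (protos[c] - x)
            protos[j] += lr * scale * 2.0 * (protos[j] - x)
```

Because the sigmoid flattens for samples far from the decision boundary, updates concentrate on samples near the boundary, which is what ties this surrogate to the actual classification error rate rather than to, say, a squared-error fit.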




Cite this article

McDermott, E., Katagiri, S. Prototype-based minimum error training for speech recognition. Appl Intell 4, 245–256 (1994). https://doi.org/10.1007/BF00872091
