A rejection model based on multi-layer perceptrons for Mandarin digit recognition

  • Correspondence
  • Journal of Computer Science and Technology

Abstract

High-performance Mandarin digit recognition (MDR) is much more difficult to achieve than its English counterpart, especially on inexpensive hardware. This paper presents a new postprocessor based on multi-layer perceptrons (MLP), which acts as an a posteriori probability estimator, and uses it as the rejection model of a speaker-independent Mandarin digit recognition system based on hidden Markov models (HMMs). Poor utterances, which are recognized by the HMMs but have a low a posteriori probability, are rejected. After rejecting about 4.9% of the tested utterances, the MLP rejection model boosts digit recognition accuracy from 97.1% to 99.6%. This performance is better than that of rejection models based on linear discrimination, likelihood ratio, or anti-digit models.
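The rejection step described in the abstract can be illustrated with a minimal sketch. It assumes each recognized utterance already carries an a posteriori probability estimate (in the paper this comes from the MLP postprocessor, which is not reproduced here), and it simply rejects utterances whose estimate falls below a threshold. The function names, the threshold value of 0.6, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Each result: (label chosen by the recognizer, true label, estimated posterior).
# The posterior values are made up; in the paper an MLP would estimate them.
Result = Tuple[int, int, float]


def accuracy(pairs: List[Tuple[int, int]]) -> float:
    """Fraction of (recognized, true) pairs that match; 0.0 for an empty list."""
    if not pairs:
        return 0.0
    return sum(1 for rec, true in pairs if rec == true) / len(pairs)


def apply_rejection(results: List[Result], threshold: float) -> Tuple[float, float, float]:
    """Reject utterances whose posterior estimate is below `threshold`.

    Returns (rejection_rate, accuracy_before, accuracy_after), where
    accuracy_after is measured on the accepted utterances only.
    """
    if not results:
        raise ValueError("results must not be empty")
    accepted = [(rec, true) for rec, true, post in results if post >= threshold]
    rejection_rate = (len(results) - len(accepted)) / len(results)
    before = accuracy([(rec, true) for rec, true, _ in results])
    after = accuracy(accepted)
    return rejection_rate, before, after


# Hypothetical toy data: ten utterances, two of them misrecognized.
TOY_RESULTS: List[Result] = [
    (1, 1, 0.98), (2, 2, 0.95), (3, 8, 0.40), (4, 4, 0.91), (7, 1, 0.55),
    (6, 6, 0.62), (9, 9, 0.99), (0, 0, 0.97), (5, 5, 0.93), (2, 2, 0.35),
]

rate, before, after = apply_rejection(TOY_RESULTS, threshold=0.6)
print(f"rejected {rate:.1%}, accuracy {before:.1%} -> {after:.1%}")
```

In this toy data three of the ten utterances fall below the threshold: two misrecognitions and one correct recognition. This shows the trade-off behind the reported figures: accuracy on accepted utterances rises, at the cost of discarding some utterances that were recognized correctly.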

Author information

Additional information

This project is supported by the National Natural Science Foundation of China (Grant No. 69975007), the National “863” High-Tech Programme of China (No. 863-306-ZD13-04-6), the Open Funds of the National Laboratory of Pattern Recognition, and Intel Architecture Development Co., Ltd.

ZHONG Lin received his B.S. and M.S. degrees in circuits and systems from Tsinghua University, Beijing, China, in 1998 and 2000, respectively. He is now a Ph.D. candidate in the Electronic Engineering Department, Princeton University, USA.

LIU Jia received his B.S., M.S., and Ph.D. degrees in communication and electronic systems from Tsinghua University, Beijing, China, in 1983, 1986 and 1990, respectively. In April 1990, he joined the Remote Sensing Satellite Ground Station, Chinese Academy of Sciences, and from 1992 to 1994 he worked as a Royal Society visiting scientist at the Engineering Department, Cambridge University, UK. He is now a professor in the Department of Electronic Engineering, Tsinghua University, and an IEEE member. His current research focuses on speech recognition, speech synthesis, speech coding, speech ASIC design and multimedia communication.

LIU Runsheng graduated from the Department of Radio and Electronics, Tsinghua University, Beijing, China, in 1958. He has worked at Tsinghua University since then and is now a professor in the Department of Electronic Engineering, where he teaches and conducts research on digital and analog circuits, IC design, electronic circuit CAD, signal processing and digital communication.

About this article

Cite this article

Zhong, L., Liu, J. & Liu, R. A rejection model based on multi-layer perceptrons for Mandarin digit recognition. J. Comput. Sci. & Technol. 17, 196–202 (2002). https://doi.org/10.1007/BF02962212

  • DOI: https://doi.org/10.1007/BF02962212
