A statistical model for an automatic procedure to compress a word transcription dictionary

Mouria-Beji, Fériel

doi:10.1007/BFb0033335

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1451)

Included in the following conference series:

Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR)

Abstract

Various experiments have conclusively shown that continuous speech recognition performs better with context-dependent phonemic models. However, we have observed that an explicit context-dependent phonemic model can yield many transcriptions for a single lexicon entry. In this work, we study how to compress a word transcription dictionary (WTD) into a more compact form that balances flexibility against reliability. We develop a statistical model, based on a measure of a likelihood function, for an automatic procedure that compresses a WTD. The compressed dictionary is then used for sentence recognition in a continuous speech recognition system. Experimental results indicate a substantial improvement in the recognition rate after compression.

Author information

Authors and Affiliations

ENSI/LIA, Artificial Intelligence Group, Cité Mahrajéne, BP 275, 1082 Tunis, Tunisia
Fériel Mouria-Beji (Member IEEE)

Editor information

Adnan Amin, Dov Dori, Pavel Pudil, Herbert Freeman

Cite this paper

Mouria-Beji, F. (1998). A statistical model for an automatic procedure to compress a word transcription dictionary. In: Amin, A., Dori, D., Pudil, P., Freeman, H. (eds) Advances in Pattern Recognition. SSPR/SPR 1998. Lecture Notes in Computer Science, vol 1451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0033335

DOI: https://doi.org/10.1007/BFb0033335
Published: 09 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64858-1
Online ISBN: 978-3-540-68526-5
eBook Packages: Springer Book Archive
