Abstract
A new approach to speech recognition from a graphical point of view is introduced. Isolated spoken letters and color-name words are considered. After recording, the speech signal is processed as an image obtained by power spectrum estimation. For feature extraction, classification, and hence recognition, the algorithm of minimal eigenvalues of Toeplitz matrices is used together with other methods of speech processing and recognition. A number of application examples and comparisons are presented. The efficiency of the method is very high in the case of the six Polish vowels and English color names, and the results encourage extending the algorithm to cover more word classes.
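The two building blocks named in the abstract, power spectrum estimation and minimal eigenvalues of Toeplitz matrices, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the Toeplitz coefficients are derived from the spectral image, so the periodogram estimator, the averaging into 16 bands, the synthetic two-tone test signal, and all function names below are assumptions made purely for illustration.

```python
import numpy as np


def periodogram(signal):
    """One-sided power spectrum estimate: |FFT|^2 / N of the mean-removed signal."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    return np.abs(np.fft.rfft(x)) ** 2 / len(x)


def symmetric_toeplitz(coeffs):
    """Symmetric Toeplitz matrix T with T[i, j] = coeffs[|i - j|]."""
    c = np.asarray(coeffs, dtype=float)
    n = len(c)
    idx = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return c[idx]


def minimal_eigenvalues(coeffs):
    """Smallest eigenvalue of the Toeplitz matrix of each order 1..len(coeffs)."""
    c = np.asarray(coeffs, dtype=float)
    # eigvalsh returns eigenvalues in ascending order, so index 0 is the minimum.
    return np.array([
        np.linalg.eigvalsh(symmetric_toeplitz(c[:k]))[0]
        for k in range(1, len(c) + 1)
    ])


# Hypothetical input: a synthetic two-tone signal standing in for a recording.
rate = 8000
t = np.arange(1024) / rate
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1320 * t)

power = periodogram(signal)                        # 513 one-sided bins
# Assumed reduction: average the first 512 bins into 16 bands and normalize.
bands = power[:512].reshape(16, 32).mean(axis=1)
coeffs = bands / bands.max()

features = minimal_eigenvalues(coeffs)             # one value per matrix order
```

Because each Toeplitz matrix here is the leading principal submatrix of the next one, the sequence in `features` is non-increasing (Cauchy interlacing). A feature vector of this kind could then be passed to any classifier; which classifier the paper uses is not stated in the abstract.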
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saeed, K., Kozłowski, M. (2003). An Image-Based System for Spoken-Letter Recognition. In: Petkov, N., Westenberg, M.A. (eds) Computer Analysis of Images and Patterns. CAIP 2003. Lecture Notes in Computer Science, vol 2756. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45179-2_61
DOI: https://doi.org/10.1007/978-3-540-45179-2_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40730-0
Online ISBN: 978-3-540-45179-2
eBook Packages: Springer Book Archive