Abstract
Advances in speech signal analysis during the last decade have allowed the development of automatic algorithms for the non-invasive detection of laryngeal pathologies. Performance assessment of such techniques reveals that classification success rates over 90% are achievable. Bearing in mind the extension of these automatic methods to remote diagnosis scenarios, this paper analyses the performance of a pathology detector based on Mel Frequency Cepstral Coefficients when the speech signal has been distorted by an analogue communications channel, namely the telephone channel. This channel is modelled as a concatenation of linear effects. It is shown that, while the overall performance of the system is degraded, success rates of around 80% can still be achieved. The study also shows that the performance degradation is mainly due to band limitation and noise addition.
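As a rough illustration of the two effects that the abstract identifies as dominant (band limitation and noise addition), the following minimal sketch applies an ideal band-pass filter followed by additive white Gaussian noise. It is not the channel model used in the paper: the function name `simulate_phone_channel`, the 300-3400 Hz pass band, the 30 dB signal-to-noise ratio and the synthetic two-tone input are all assumptions made for the example, and the channel frequency response within the pass band is left out.

```python
import numpy as np


def simulate_phone_channel(signal, fs, band=(300.0, 3400.0), snr_db=30.0, seed=0):
    """Hypothetical simplified channel: ideal band-pass plus white Gaussian noise.

    The pass band and SNR defaults are illustrative assumptions, not values
    taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)

    # Band limitation: zero every FFT bin outside the assumed pass band.
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    spectrum[(freqs < band[0]) | (freqs > band[1])] = 0.0
    filtered = np.fft.irfft(spectrum, n=signal.size)

    # Noise addition: scale white noise to the requested SNR, measured
    # against the band-limited signal.
    signal_power = np.mean(filtered ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.size)

    return filtered + noise, filtered


# Synthetic input: one tone below the assumed pass band, one tone inside it.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 1000 * t)

y, filtered = simulate_phone_channel(x, fs)
measured_snr_db = 10.0 * np.log10(np.mean(filtered ** 2) / np.mean((y - filtered) ** 2))
```

In this sketch the 100 Hz tone is removed by the band limitation and the 1000 Hz tone is kept, so `filtered` retains half of the input power, and `measured_snr_db` should be close to the requested 30 dB. Features such as Mel Frequency Cepstral Coefficients could then be computed on `y` instead of `x` to observe how the distortion affects them.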
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fraile, R., Sáenz-Lechón, N., Godino-Llorente, J.I., Osma-Ruiz, V., Fredouille, C. (2010). Effect of a Simulated Analogue Telephone Channel on the Performance of a Remote Automatic System for the Detection of Pathologies in Voice: Impact of Linear Distortions on Cepstrum-Based Assessment - Band Limitation, Frequency Response and Additive Noise. In: Fred, A., Filipe, J., Gamboa, H. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2009. Communications in Computer and Information Science, vol 52. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11721-3_13
DOI: https://doi.org/10.1007/978-3-642-11721-3_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11720-6
Online ISBN: 978-3-642-11721-3
eBook Packages: Computer Science, Computer Science (R0)