Abstract
This paper proposes and evaluates a new method that directly transforms laryngectomee speech into normal speech at the waveform level. Most conventional speech recognition systems and other speech processing systems cannot handle laryngectomee speech with satisfactory results. One major cause is the difficulty of preparing corpora: it is very hard to record a large amount of clear, intelligible utterance data because the acoustic quality depends strongly on each speaker's individual condition.
We focus on the acoustic characteristics of speech waveforms produced by laryngectomees and transform them directly into normal speech. The proposed method handles both esophageal and alaryngeal speech with the same algorithm. It works by learning transform rules that capture acoustic correspondences between laryngectomee speech and normal speech. Results of several fundamental experiments indicate that the method shows promise for real transformation.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Murakami, K., Araki, K., Hiroshige, M., Tochinai, K. (2003). Effectiveness of A Direct Speech Transform Method Using Inductive Learning from Laryngectomee Speech to Normal Speech. In: Gedeon, T.D., Fung, L.C.C. (eds) AI 2003: Advances in Artificial Intelligence. AI 2003. Lecture Notes in Computer Science, vol 2903. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24581-0_59
DOI: https://doi.org/10.1007/978-3-540-24581-0_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20646-0
Online ISBN: 978-3-540-24581-0
eBook Packages: Springer Book Archive