Abstract
In this paper, we propose a voice changing method that provides seamless on/off switching at low computational complexity for digital imaging devices. The proposed method combines a waveform similarity overlap-and-add (WSOLA) algorithm with a sampling rate changing technique that operates in the time domain. In addition, the proposed method applies a noise technique in the regions where the voice changing mode switches from on to off, and vice versa. Finally, we compare the performance of the proposed method with that of a conventional method in terms of processing time and speech quality. The experiments show that the proposed voice changing method achieves a relative complexity reduction of 84.5% on a resource-constrained device with an ARM processor and is preferred over the conventional method by 76%.
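The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of pairing WSOLA with a time-domain sampling rate change, not the authors' implementation. It assumes a mono floating-point signal; the function names (`wsola`, `resample`, `pitch_shift`), the frame length, and the search tolerance are hypothetical choices made for illustration. The switching-region noise technique mentioned above is not covered.

```python
import numpy as np


def wsola(x, stretch, frame=512, tolerance=128):
    """Basic WSOLA time-stretch: stretch > 1 makes x longer, pitch unchanged.

    frame and tolerance are illustrative defaults, not values from the paper.
    """
    hop_out = frame // 2                      # 50% overlap on the output side
    hop_in = hop_out / stretch                # nominal input advance per frame
    window = np.hanning(frame + 1)[:frame]    # periodic Hann, sums to 1 at 50% overlap
    n_frames = int(np.ceil(len(x) * stretch / hop_out))
    # Zero-pad so every search region and target segment stays in bounds.
    xp = np.pad(np.asarray(x, dtype=float),
                (tolerance, frame + tolerance + hop_out))
    y = np.zeros(n_frames * hop_out + frame)
    prev_start = 0
    for k in range(n_frames):
        center = int(round(k * hop_in)) + tolerance
        if k == 0:
            start = center
        else:
            # Natural continuation of the previously copied segment.
            target = xp[prev_start + hop_out: prev_start + hop_out + frame]
            # Search around the nominal position for the most similar segment
            # (unnormalized cross-correlation, a simplification).
            region = xp[center - tolerance: center + tolerance + frame]
            corr = np.correlate(region, target, mode="valid")
            start = center - tolerance + int(np.argmax(corr))
        y[k * hop_out: k * hop_out + frame] += window * xp[start:start + frame]
        prev_start = start
    return y[: int(len(x) * stretch)]


def resample(x, ratio):
    """Read x `ratio` times faster using linear interpolation.

    No anti-aliasing filter is applied, which is a simplification.
    """
    positions = np.arange(0, len(x) - 1, ratio)
    return np.interp(positions, np.arange(len(x)), x)


def pitch_shift(x, factor):
    """Scale pitch by `factor` while keeping the duration roughly constant."""
    stretched = wsola(x, factor)        # duration becomes factor times longer
    return resample(stretched, factor)  # duration restored, pitch scaled
```

For example, `pitch_shift(signal, 1.5)` would raise the pitch by a factor of 1.5 and `pitch_shift(signal, 0.8)` would lower it. Both stages work directly on time-domain samples, which is the kind of structure that keeps the computational cost low on a constrained processor.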
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jo, S.D., Lee, Y.H., Park, J.H., Kim, H.K., Kim, J.W., Kim, M.B. (2011). High-Quality and Low-Complexity Real-Time Voice Changing with Seamless Switching for Digital Imaging Devices. In: Kim, Th., Adeli, H., Robles, R.J., Balitanas, M. (eds) Ubiquitous Computing and Multimedia Applications. UCMA 2011. Communications in Computer and Information Science, vol 151. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20998-7_3
DOI: https://doi.org/10.1007/978-3-642-20998-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20997-0
Online ISBN: 978-3-642-20998-7
eBook Packages: Computer Science, Computer Science (R0)