Abstract:
Recently, many deep learning models have succeeded in replacing conventional methods in speech processing. Vector quantization is a popular technique for reducing the amount of speech data before transmission. Conventional vector quantization methods are based on mathematical models. In the last few years, the Vector Quantized Variational AutoEncoder has been proposed for end-to-end vector quantization based on deep learning techniques. In this paper, we investigate sub-band quantization in the Vector Quantized Variational AutoEncoder. This model can concentrate on specific frequency bands, assigning them more bits and leaving less important bands with fewer bits. Experimental results show the efficiency of the proposed quantization method for the spectral envelope parameters of WORLD, a high-quality vocoder that operates at a 48 kHz sampling frequency. At the same four target bit rates, the sub-band Vector Quantized Variational AutoEncoder reduces the Log Spectral Distortion by around 0.93 dB on average.
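The abstract does not give the exact formula used for Log Spectral Distortion, so the following is only a minimal sketch of one common definition: the root-mean-square difference between two spectral envelopes on a dB scale, averaged over frames. It assumes the envelopes are power spectra (hence `10 * log10`). The function names and the per-band bit allocation in the usage note are hypothetical illustrations, not values taken from the paper.

```python
import math


def log_spectral_distortion(ref_frames, est_frames):
    """Mean per-frame log spectral distortion in dB.

    Assumption: each frame is a list of strictly positive power-spectrum
    values, and both inputs have the same shape.
    """
    if len(ref_frames) != len(est_frames):
        raise ValueError("frame counts differ")
    per_frame = []
    for ref, est in zip(ref_frames, est_frames):
        if len(ref) != len(est):
            raise ValueError("bin counts differ")
        # Squared difference of the two envelopes in dB, per frequency bin.
        sq = [(10.0 * math.log10(r) - 10.0 * math.log10(e)) ** 2
              for r, e in zip(ref, est)]
        # RMS over frequency bins gives the distortion of this frame.
        per_frame.append(math.sqrt(sum(sq) / len(sq)))
    # Average the per-frame distortions.
    return sum(per_frame) / len(per_frame)


def codebook_sizes(bits_per_band):
    """Number of codewords addressable in each band: b bits index 2**b entries."""
    return [2 ** b for b in bits_per_band]
```

As a usage illustration, a hypothetical allocation such as `codebook_sizes([8, 4, 2])` gives more codewords to the first band than to the others, which is the kind of uneven bit assignment across frequency bands that the abstract describes.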
Published in: TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON)
Date of Conference: 17-20 October 2019
Date Added to IEEE Xplore: 12 December 2019