ISCA Archive Interspeech 2018

Multi-frame Quantization of LSF Parameters Using a Deep Autoencoder and Pyramid Vector Quantizer

Yaxing Li, Eshete Derb Emiru, Shengwu Xiong, Anna Zhu, Pengfei Duan, Yichang Li

This paper presents a multi-frame quantization of line spectral frequency (LSF) parameters using a deep autoencoder (DAE) and a pyramid vector quantizer (PVQ). The objective is to provide effective LSF quantization for ultra-low-bit-rate speech coders with moderate delay. For the compression and decorrelation of multiple LSF frames, a DAE with linear coder-layer units and Gaussian noise is used. The DAE demonstrates a high degree of modelling flexibility for multiple LSF frames. To quantize the coder-layer vector effectively, a PVQ is employed. Compared with the discrete cosine model (DCM), the DAE-based compression models multi-frame LSF parameters more accurately and has the advantage that the coder-layer dimension can be set to any value. The compressed coder-layer dimension of the DAE governs the trade-off between the modelling distortion and the coder-layer quantization distortion. Experimental results show that the proposed algorithm, with the optimal coder-layer dimension, outperforms the DCM-based multi-frame LSF quantization approach in terms of spectral distortion (SD) performance and robustness across different speech segments.
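The abstract does not give implementation details, but the PVQ stage it refers to is the classical pyramid vector quantizer, which maps a real vector onto the nearest point of the pyramid S(L, K), the set of L-dimensional integer vectors whose absolute values sum to K. A minimal sketch of that projection step (a generic greedy pulse-allocation method, not the authors' implementation; the function name and the choice of K are illustrative assumptions) might look like:

```python
import numpy as np

def pvq_quantize(x, K):
    """Project x onto the pyramid S(L, K): integer vectors y with sum(|y_i|) = K.

    Greedy sketch: scale x so its L1 norm equals K, take the floor of the
    magnitudes, then distribute the remaining pulses to the components
    with the largest rounding residuals.
    """
    x = np.asarray(x, dtype=float)
    l1 = np.sum(np.abs(x))
    if l1 == 0:
        # Degenerate input: put all K pulses on the first component.
        y = np.zeros(x.shape, dtype=int)
        y[0] = K
        return y
    mag = np.abs(x) * (K / l1)          # magnitudes scaled to L1 norm K
    y_mag = np.floor(mag).astype(int)   # integer part; sum is at most K
    resid = mag - y_mag                 # fractional residuals
    for _ in range(K - int(y_mag.sum())):
        i = int(np.argmax(resid))       # largest residual gets the next pulse
        y_mag[i] += 1
        resid[i] -= 1.0
    return y_mag * np.sign(x).astype(int)
```

For example, `pvq_quantize([0.7, -0.2, 0.1], 5)` yields `[4, -1, 0]`, whose absolute values sum to exactly K = 5. In the paper's pipeline this projection would be applied to the DAE's coder-layer vector, with K (and hence the bit budget) chosen to balance modelling and quantization distortion.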


doi: 10.21437/Interspeech.2018-2577

Cite as: Li, Y., Emiru, E.D., Xiong, S., Zhu, A., Duan, P., Li, Y. (2018) Multi-frame Quantization of LSF Parameters Using a Deep Autoencoder and Pyramid Vector Quantizer. Proc. Interspeech 2018, 3553-3557, doi: 10.21437/Interspeech.2018-2577

@inproceedings{li18p_interspeech,
  author={Yaxing Li and Eshete Derb Emiru and Shengwu Xiong and Anna Zhu and Pengfei Duan and Yichang Li},
  title={{Multi-frame Quantization of LSF Parameters Using a Deep Autoencoder and Pyramid Vector Quantizer}},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3553--3557},
  doi={10.21437/Interspeech.2018-2577}
}