Optimum switched split vector quantization of LSF parameters

doi:10.1016/j.sigpro.2008.01.001

Signal Processing

Volume 88, Issue 6, June 2008, Pages 1528-1538

Abstract

We address the issue of rate–distortion (R/D) performance optimality of the recently proposed switched split vector quantization (SSVQ) method. The distribution of the source is modeled using a Gaussian mixture density, and thus the non-parametric SSVQ is analyzed in a parametric model based framework for achieving optimum R/D performance. Using high rate quantization theory, we derive the optimum bit allocation formulae for the intra-cluster split vector quantizer (SVQ) and the inter-cluster switching.

For wide-band speech line spectrum frequency (LSF) parameter quantization, it is shown that the Gaussian mixture model (GMM) based optimum parametric SSVQ method provides a 1 bit/vector advantage over the non-parametric SSVQ method.

Introduction

Most speech coders use linear prediction (LP) analysis, so more effective schemes for quantizing the LP coefficients (LPCs), or equivalently the line spectrum frequencies (LSFs), are in great demand. Vector quantization (VQ) of LSFs is the best way to reach the lowest bitrate, but the prohibitive complexity of a full-search VQ limits its usage. Many different product code VQ methods [1], [2], [3], [4], [5] have been reported for LSF coding, which reduce complexity with a moderate loss of quantization performance. One widely reported technique is the split vector quantization (SVQ) method, which was first proposed by Paliwal and Atal [6] for telephone-band speech and then further explored for wide-band speech [7], [8], [9]. Recently, So and Paliwal proposed the switched split vector quantization (SSVQ) method [10], [11], which is shown to provide better R/D performance than the traditional SVQ method for both telephone-band and wide-band speech. The SSVQ is further explored in [12], [13] to show its competitive performance advantage over many other product code VQ methods.

The SSVQ is a non-parametric product code VQ method, where the vector space is divided into non-overlapping Voronoi regions and a separate SVQ is designed for each region. Thus, the SSVQ is composed of multiple SVQs. An input vector to be quantized is first classified into a Voronoi region, and then the region specific SVQ is used for quantization. Though the SSVQ provides better rate–distortion (R/D) performance than the SVQ, the optimality of its R/D performance has not been addressed.
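The two-stage encoding described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`nearest`, `ssvq_encode`, `ssvq_decode`), the codebook layout, and the use of squared Euclidean distance for both stages are assumptions made for the example, and codebook training is not shown.

```python
import numpy as np

def nearest(codebook, x):
    """Index of the codevector closest to x in squared Euclidean distance."""
    return int(np.argmin(np.sum((codebook - x) ** 2, axis=1)))

def ssvq_encode(x, switch_codebook, region_svqs, split_dims):
    """Return (region index, list of per-split codevector indices).

    switch_codebook : (M, p) array, one centroid per Voronoi region
                      (hypothetical layout)
    region_svqs     : list of M lists; region_svqs[m][k] is the codebook of
                      shape (N_k, split_dims[k]) for split k of region m
    split_dims      : sub-vector dimensions, summing to p
    """
    # Stage 1: switch to the Voronoi region whose centroid is nearest.
    m = nearest(switch_codebook, x)
    # Stage 2: quantize each sub-vector with that region's split codebooks.
    bounds = np.cumsum([0] + list(split_dims))
    indices = [nearest(region_svqs[m][k], x[bounds[k]:bounds[k + 1]])
               for k in range(len(split_dims))]
    return m, indices

def ssvq_decode(m, indices, region_svqs):
    """Rebuild the quantized vector from the region and split indices."""
    return np.concatenate([region_svqs[m][k][i] for k, i in enumerate(indices)])
```

In this layout the bits per vector are the sum of the switch bits (log2 of the number of regions) and the bits of each split codebook, which is why the split between inter-cluster and intra-cluster bits matters for R/D performance.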

There is currently a growing interest in developing parametric pdf based quantization methods using the Gaussian mixture model (GMM), such that the optimum R/D performance can be achieved at a given bitrate [14], [15], [16], [17]. In this paper, we address the R/D performance optimality of SSVQ in a GMM based framework. We derive the optimum bit allocation criteria for both stages of quantization, referred to as inter-cluster and intra-cluster bit allocation. We resort to the fixed and variable bitrate schemes of [16] for inter-cluster bit allocation. For intra-cluster bit allocation, we derive the R/D performance expression of the region specific optimum SVQ method using high rate quantization theory. We use the squared Euclidean distance (SED) as the distortion measure for ease of analysis. Focusing on wide-band speech LSF quantization, we show that the optimum parametric SSVQ method provides a 1 bit/vector advantage over the non-parametric SSVQ method.

Section snippets

Preliminaries

For a source pdf given by $f_{g}(g)$, the high rate quantization distortion (mean square error) using a VQ is bounded as [18]: $D \geq N^{-2/h} \frac{1}{\pi} \frac{h}{h+2} \left[\frac{h}{2} \Gamma\left(\frac{h}{2}\right)\right]^{2/h} \left[\int [f_{g}(g)]^{h/(h+2)} \, dg\right]^{(h+2)/h}, \qquad (1)$ where $N = 2^{b_{g}}$ is the number of Voronoi regions, $b_{g}$ is the number of bits/vector allocated to quantize the source, $h$ is the dimension of the vector $g$, and $\Gamma(\cdot)$ is the usual gamma function.

Let us consider the lower bound (equality in Eq. (1)) for a multi-variate Gaussian source. Suppose $f_{g}(g)$ is multi-variate Gaussian distributed as $f_{g}(g) = \mathcal{N}(\mu_{g}, C_{g})$.
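For a Gaussian density the integral in the bound has a closed form. Using the standard Gaussian integral, $\left[\int [f_{g}(g)]^{h/(h+2)} dg\right]^{(h+2)/h} = 2\pi \left(\frac{h+2}{h}\right)^{(h+2)/2} |C_{g}|^{1/h}$. This expression is worked out here from the bound as stated and is not quoted from the paper. The sketch below evaluates the resulting lower bound numerically; the function name and its arguments are hypothetical.

```python
import math
import numpy as np

def gaussian_high_rate_distortion(cov, bits):
    """High-rate lower bound on the MSE of an h-dimensional Gaussian source
    with covariance `cov`, quantized with `bits` bits/vector (N = 2**bits)."""
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    h = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    # log of (1/pi) * h/(h+2) * [(h/2) * Gamma(h/2)]**(2/h),
    # using (h/2) * Gamma(h/2) = Gamma(h/2 + 1).
    log_prefactor = (-math.log(math.pi) + math.log(h / (h + 2))
                     + (2.0 / h) * math.lgamma(h / 2 + 1))
    # log of 2*pi * ((h+2)/h)**((h+2)/2) * |C|**(1/h)  (Gaussian integral term)
    log_integral = (math.log(2 * math.pi)
                    + ((h + 2) / 2) * math.log((h + 2) / h)
                    + logdet / h)
    # log of N**(-2/h) with N = 2**bits
    log_rate = -(2.0 / h) * bits * math.log(2.0)
    return math.exp(log_prefactor + log_integral + log_rate)
```

As a sanity check, for $h = 1$ and variance $\sigma^{2}$ the expression reduces to $\frac{\sqrt{3}\pi}{2} \sigma^{2} 2^{-2 b_{g}}$, the familiar high-rate result for a scalar Gaussian source.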

Optimum SVQ

The SSVQ consists of multiple SVQs. Thus, we first address the R/D performance optimality of the SVQ method using high rate quantization theory. For analyzing the optimum SVQ, let us assume that the source vector is multi-variate Gaussian distributed. This assumption is well justified in the context of analyzing SSVQ, since the SVQ is applied to a subset of data occurring within a Voronoi region. Also, let the source be quantized using c bits/vector.
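To illustrate the kind of intra-cluster allocation problem involved, the sketch below splits c bits among Gaussian sub-vectors. It is not the formula derived in the paper. It assumes that each split i of dimension $h_{i}$ is quantized independently and attains the Gaussian high-rate bound $D_{i} = K_{i} 2^{-2 b_{i}/h_{i}}$, and it minimizes $\sum_{i} D_{i}$ subject to $\sum_{i} b_{i} = c$ with a Lagrange multiplier, which gives $D_{i}/h_{i}$ equal across splits. The function names are hypothetical, and the returned bit counts are real-valued (rounding and negative allocations at low rates are not handled).

```python
import math
import numpy as np

def split_constant(cov):
    """Return (h, K) with D(b) = K * 2**(-2*b/h) for a Gaussian sub-vector,
    taking equality in the high-rate bound (an assumption of this sketch)."""
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    h = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    log_k = (math.log(2.0) + math.log(h / (h + 2))
             + (2.0 / h) * math.lgamma(h / 2 + 1)
             + ((h + 2) / 2) * math.log((h + 2) / h)
             + logdet / h)
    return h, math.exp(log_k)

def allocate_split_bits(split_covs, total_bits):
    """Real-valued bits per split minimizing the summed high-rate distortion."""
    dims, consts = zip(*(split_constant(c) for c in split_covs))
    p = sum(dims)
    # Stationarity condition: D_i / h_i = theta for every split, so
    # b_i = (h_i / 2) * (log2(K_i / h_i) - log2(theta)).
    s = sum(0.5 * h * math.log2(k / h) for h, k in zip(dims, consts))
    # Choose theta so that the allocations sum to total_bits.
    log2_theta = (2.0 / p) * (s - total_bits)
    return [0.5 * h * (math.log2(k / h) - log2_theta)
            for h, k in zip(dims, consts)]
```

For example, under these assumptions two 2-dimensional splits with covariances `4 * np.eye(2)` and `np.eye(2)` and c = 10 receive 6 and 4 bits, so the higher-variance split gets more bits.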

Let $X$ be the p-dimensional vector which is

Optimum switched split VQ

In this section, we address the R/D performance optimality of the SSVQ method using a parametric model of the source pdf. The basis of the SSVQ method is to populate the vector space with M SVQs and to switch to one of them for quantization, based on a nearest neighbor criterion [12], [13]. While SSVQ is shown to be better than SVQ, the issue of R/D performance optimality has not been addressed so far. We address this issue using a GMM based framework for the source signal. Each

Quantization experiments

To test the LSF quantization performance, we consider wide-band speech LSFs. The speech data used in the experiments are from the TIMIT database. The specification of the AMR-WB speech codec [19] is used to compute the 16th order LPCs, which are then converted to LSFs. We briefly describe the LPC analysis method in the AMR-WB speech codec [19]. The 16 kHz speech is processed in two sub-bands, 0.05–6.4 and 6.4–7 kHz, to allocate the bits optimally according to the subjective importance of the lower band. In

Conclusion

We address the rate–distortion (R/D) performance optimality of the recently proposed switched split VQ (SSVQ) method. Using the GMM based framework, the optimality of SSVQ is addressed using a linearized approximation to the total average distortion. This results in optimum inter-cluster and intra-cluster bit allocation schemes. For wide-band speech LSF quantization, we show that the new parametric optimum SSVQ methods perform better than the non-parametric SSVQ method.

References (21)

D. Chang et al., A classified vector quantization of LSF parameters, Signal Processing (June 1997)
S. So et al., Efficient product code vector quantisation using the switched split vector quantiser, Digital Signal Process. (January 2007)
S. So et al., A comparative study of LPC parameter representations and quantisation schemes for wide-band speech coding, Digital Signal Process. (January 2007)
R. Laroia et al., Robust and efficient quantization of speech LSP parameters using structured vector quantizers
W.F. LeBlanc et al., Efficient search and design procedures for robust multi-stage VQ of LPC parameters for 4 Kb/s speech coding, IEEE Trans. Speech Audio Proc. (October 1993)
S. Chatterjee et al., Two stage transform vector quantization of LSFs for wideband speech coding
S. Chatterjee, T.V. Sreenivas, Conditional PDF-based split vector quantization of wideband LSF parameters, IEEE Signal...
K.K. Paliwal et al., Efficient vector quantization of LPC parameters at 24 bits/frame, IEEE Trans. Acoust. Speech Signal Process. (January 1993)
R. Lefebvre et al., High quality coding of wide-band audio signals using transform coded excitation (TCX)
J.H. Chen et al., Transform predictive coding of wide-band speech signals
