Spatial parameters for audio coding: MDCT domain analysis and synthesis

Chen, Shuixian; Xiong, Naixue; Hyuk Park, Jong; Chen, Min; Hu, Ruimin

doi:10.1007/s11042-009-0326-4

Published: 22 July 2009

Volume 48, pages 225–246, (2010)

Multimedia Tools and Applications

Shuixian Chen¹,
Naixue Xiong²,
Jong Hyuk Park³,
Min Chen⁴ &
Ruimin Hu¹

Abstract

We use the Modified Discrete Cosine Transform (MDCT) to analyze and synthesize spatial parameters. The MDCT by itself lacks phase information and energy conservation, both of which are needed to represent spatial parameters. Complementing the MDCT with the Modified Discrete Sine Transform (MDST) to form “MDCT-j*MDST” overcomes this and enables a representation similar to that of the DFT. Owing to overlap-add in the time domain, an MDST spectrum can be built perfectly from the MDCT spectra of neighboring frames through matrix-vector multiplication. The matrix is heavily diagonal, and keeping only a small number of its sub-diagonals is sufficient for approximation. When an MDCT-based core coder such as Advanced Audio Coding (AAC) is used in spatial audio coding, no separate transform is needed for spatial processing, which significantly reduces the computational complexity. Subjective listening tests also show that MDCT-domain spatial processing causes no quality impairment.
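
As a rough illustration of the claim that an MDST spectrum can be recovered exactly from the MDCT spectra of neighboring frames, the following Python sketch takes the time-domain route: it inverts three half-overlapping MDCT frames, overlap-adds them to recover the middle frame, and applies the MDST to the result. The kernel phase follows the exponent in (A.3); the frame size, the sine window used for both transforms, the 2/N synthesis scaling and all variable names are assumptions made for this example, not the paper's implementation, which according to the abstract performs the conversion as a matrix-vector product.

```python
import numpy as np

N = 8                                    # assumed number of coefficients per frame
n = np.arange(2 * N)
k = np.arange(N)
phase = np.pi / N * np.outer(n + 0.5 + N / 2, k + 0.5)
C, S = np.cos(phase), np.sin(phase)      # 2N x N; columns are MDCT / MDST basis vectors
w = np.sin(np.pi / (2 * N) * (n + 0.5))  # sine window (assumption)

rng = np.random.default_rng(4)
signal = rng.standard_normal(4 * N)      # long enough for three half-overlapping frames
frames = [signal[t * N: t * N + 2 * N] for t in range(3)]

# MDCT spectra of the previous, current and next frame
mdct = [C.T @ (w * f) for f in frames]

# Windowed inverse MDCT of each frame; the 2/N scaling makes overlap-add exact
imdct = [(2.0 / N) * w * (C @ X) for X in mdct]

# Overlap-add cancels the time-domain aliasing inside the current (middle) frame
current = np.concatenate((imdct[0][N:] + imdct[1][:N],
                          imdct[1][N:] + imdct[2][:N]))

mdst_rebuilt = S.T @ (w * current)       # MDST obtained from MDCT data only
mdst_direct = S.T @ (w * frames[1])      # MDST computed from the time signal
```

In this sketch `mdst_rebuilt` and `mdst_direct` should agree to floating-point precision, because overlap-add restores the middle frame before the MDST is taken.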

References

3GPP specification Series TS 26.410 (2005) General audio codec audio processing functions; enhanced aacPlus general audio codec; floating-point ANSI-C code, http://www.3gpp.org/ftp/Specs/html-info/26-series.htm, Apr. 2005
3GPP Specification Series TS26.405 (2005) General audio codec audio processing functions; enhanced aacPlus general audio codec; encoder specification; parametric stereo part, http://www.3gpp.org/ftp/Specs/html-info/26-series.htm, Apr. 2005
Algazi VR, Duda RO, Thompson DM, Avendano C (2001) The CIPIC HRTF database. Presented at IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics
Baumgarte F, Faller C (2002a) Estimation of auditory spatial cues for binaural cue coding. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp 1801–1804
Baumgarte F, Faller C (2002b) Why binaural cue coding is better than intensity stereo coding. Presented at the 112th AES Convention, Munich, Germany
Baumgarte F, Faller C (2003) Binaural cue coding—part I: psychoacoustic fundamentals and design principles. IEEE Trans Speech Audio Process 11:509–519. doi:10.1109/TSA.2003.818109
Blauert J (1983) Spatial hearing: the psychophysics of human sound localization. MIT, USA
Bosi M, Goldberg R (2003) MPEG-2 AAC. In: Introduction to digital audio coding and standards, chap. 13. Kluwer Academic, USA, pp 333–367
Bosi M, Brandenburg K, Quackenbush S, Fielder L, Akagiri K, Fuchs H, Dietz M (1997) ISO/IEC MPEG-2 advanced audio coding. J Audio Eng Soc 45(10):789–814
Breebaart J (2007) Analysis and synthesis of binaural parameters for efficient 3D audio rendering in MPEG Surround. In: IEEE International Conference on Multimedia and Expo, Beijing, China, pp 1878–1881
Breebaart J, van de Par S, Kohlrausch A (2001) Binaural processing model based on contralateral inhibition. I. Model structure. J Acoust Soc Am 110:1074–1088. doi:10.1121/1.1383297
Breebaart J, Disch S, Faller C, Herre J, Hotho G, Kjörling K, Myburg F, Neusinger M, Oomen W, Purnhagen H, Rödén J (2005a) MPEG spatial audio coding / MPEG Surround: overview and current status. Presented at the 119th AES Convention, New York
Breebaart J, van de Par S, Kohlrausch A, Schuijers E (2005b) Parametric coding of stereo audio. EURASIP J Appl Signal Process 9:1305–1322. doi:10.1155/ASP.2005.1305
Breebaart J, Hotho G, Koppens J, Schuijers E, Oomen W, van de Par S (2007) Background, concept and architecture for the recent MPEG Surround standard on multi-channel audio compression. J Audio Eng Soc 55:331–351
Breebaart J, Villemoes L, Kjörling K (2008) Binaural rendering in MPEG Surround. EURASIP J Adv Signal Process, Article ID 732895
Cheng CI (2004) Method for estimating magnitude and phase in the MDCT domain. Presented at the 116th AES Convention, Berlin, Germany
Disch S, Ertel C, Faller C, Herre J, Hilpert J, Hoelzer A, Kroon P, Linzmeier K, Spenger C (2004) Spatial audio coding: next-generation efficient and compatible coding of multi-channel audio. Presented at the 117th AES Convention, San Francisco, USA
Engdegård J, Purnhagen H, Rödén J, Liljeryd L (2004) Synthetic ambience in parametric stereo coding. Presented at 116th AES Convention, Berlin, Germany
Faller C (2004) Parametric coding of spatial audio. Ph.D. Dissertation, Institut de systèmes de communication, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Faller C (2006) Parametric multichannel audio coding: synthesis of coherence cues. IEEE Trans Audio Speech Lang Process 14:299–310. doi:10.1109/TSA.2005.854105
Faller C, Baumgarte F (2001) Efficient representation of spatial audio using perceptual parameterization. Presented at IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, New York
Faller C, Baumgarte F (2002a) Binaural cue coding: a novel and efficient representation of spatial audio. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp 1841–1844
Faller C, Baumgarte F (2002b) Binaural cue coding applied to stereo and multi-channel audio compression. Presented at the 112th AES Convention, Munich, Germany
Faller C, Baumgarte F (2002c) Binaural cue coding applied to audio compression with flexible rendering. Presented at the 113th AES Convention, Los Angeles, USA
Faller C, Baumgarte F (2003) Binaural cue coding—part II: schemes and applications. IEEE Trans Speech Audio Process 11:520–531. doi:10.1109/TSA.2003.818108
Fliege NJ (1994) Modified DFT Polyphase SBC filter banks with almost perfect reconstruction. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp 149–152
Gilkey R, Anderson TR (eds) (1997) Binaural and spatial hearing in real and virtual environments. Erlbaum, Mahwah, NJ
Herre J (2004) From joint stereo to spatial audio coding-recent progress and standardization. In: Proc. of the 7th Int. Conference on Digital Audio Effects, Naples, Italy, Oct. 2004, pp. 157–162
Herre J, Purnhagen H, Breebaart J, Faller C, Disch S, Kjörling K (2005) The reference model architecture for MPEG spatial audio coding. Presented at the 118th AES Convention, Barcelona, Spain
Herre J, Kjörling K, Breebaart J, Faller C, Disch S, Purnhagen H, Koppens J, Hilpert J, Rödén J, Oomen W, Linzmeier K, Chong KS (2008) MPEG Surround - the ISO/MPEG standard for efficient and compatible multi-channel audio coding. J Audio Eng Soc 56:932–955
Hotho G, Villemoes LF, Breebaart J (2008) A backward-compatible multichannel audio codec. IEEE Trans Audio Speech Lang Process 16:83–93. doi:10.1109/TASL.2007.910768
ISO/IEC JTC1/SC29/WG11 (2005) Information technology—generic coding of moving pictures and associated audio information—part 7: advanced audio coding (AAC), ISO/IEC 13818-7:2005(E)
ISO/IEC JTC1/SC 29/WG11 (2006) MPEG Audio sub-group, Text of ISO/IEC 23003-1:2006/FCD, MPEG Surround
ITU (2003) Method for the subjective assessment of intermediate quality level of coding systems, ITU-R BS.1534-1
Joris P, Yin TCT (2006) A matter of time: internal delays in binaural processing. Trends Neurosci 30:70–78. doi:10.1016/j.tins.2006.12.004
Karp T, Fliege NJ (1995) MDFT filter banks with perfect reconstruction. Presented at IEEE International Symposium on Circuits and Systems
Malvar HS (1990) Lapped transforms for efficient transform/subband coding. IEEE Trans Acoust Speech Signal Process 38:969–978. doi:10.1109/29.56057
Malvar HS (1991) Fast algorithm for modulated lapped transform. Electron Lett 27(9):775–776. doi:10.1049/el:19910482
Malvar HS (1992) Signal processing with lapped transforms. Artech House, Norwood, MA
Malvar H (1999) A modulated complex lapped transform and its applications to audio processing. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp 1421–1424
Malvar HS (2003) Fast algorithm for the modulated complex lapped transform. IEEE Signal Process Lett 10:8–10. doi:10.1109/LSP.2002.806700
Malvar HS, Staelin DH (1989) The LOT: transform coding without blocking effects. IEEE Trans Acoust Speech Signal Process 37:553–559. doi:10.1109/29.17536
McAlpine D, Jiang D, Palmer AR (2001) A neural code for low-frequency sound localization in mammals. Nat Neurosci 4:396–401. doi:10.1038/86049
Mu-Huo C, Yu-Hsin H (2003) Fast IMDCT and MDCT algorithms—a matrix approach. IEEE Trans Signal Process 51:221–229. doi:10.1109/TSP.2002.806566
Munkong R, Biing-Hwang J (2008) Auditory perception and cognition. IEEE Signal Process Mag 25:98–117. doi:10.1109/MSP.2008.918418
Plogsties J, Breebaart J, Herre J, Villemoes L, Jin C, Kjörling K, Koppens J (2006) MPEG Surround binaural rendering—surround sound for mobile devices. Presented at 24th Tonmeistertagung—VDT International Convention, Leipzig, Germany
Princen J, Bradley A (1986) Analysis/synthesis filter bank design based on time domain aliasing cancellation. IEEE Trans Acoust Speech Signal Process 34:1153–1161. doi:10.1109/TASSP.1986.1164954
Princen JP, Johnson AW, Bradley AB (1987) Subband/transform coding using filter bank designs based on time domain aliasing cancellation. In: Proceedings of IEEE International Conference on Acoustics, Speech, Signal Processing, pp 2161–2164
Quackenbush S, Herre J (2005) MPEG Surround. IEEE Multimedia 12:18–23. doi:10.1109/MMUL.2005.76
Rödén J, Breebaart J, Hilpert J, Purnhagen H, Schuijers E, Koppens J, Linzmeier K, Holzer A (2007) A study of the MPEG Surround quality versus bit-rate curve. Presented at the 123rd AES Convention, New York, USA
Schuijers EGP, Oomen AWJ, den Brinker AC, Gerrits AJ (2003) Advances in parametric coding for high-quality audio. Presented at the 114th AES Convention, Amsterdam, The Netherlands
Schuijers E, Breebaart J, Purnhagen H, Engdegård J (2004) Low complexity parametric stereo coding. Presented at the 116th AES Convention, Berlin, Germany
Strutt JW (Lord Rayleigh) (1907) On our perception of sound direction. Philos Mag 13:214–232
Wang Y, Vilermo M (2003) Modified discrete cosine transform—its implications for audio coding and error concealment. J Audio Eng Soc 51:52–61

Acknowledgement

This research was supported by the National Science Foundation of China (grant 60832002) and by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Advancement) (IITA-2009-C1090-0902-0020).

Author information

Authors and Affiliations

Computer School, Wuhan University, Wuhan, China
Shuixian Chen & Ruimin Hu
Department of Computer Science, Georgia State University, Atlanta, GA, USA
Naixue Xiong
Department of Computer Science and Engineering, Kyungnam University, Masan, Korea
Jong Hyuk Park
School of Computer Science & Engineering, Seoul National University, Seoul, 151-744, Korea
Min Chen

Corresponding author

Correspondence to Jong Hyuk Park.

Appendix

1.1 A. MDFT energy conservation

As in (3.a) and (3.b), $ {c_0}, \ldots, {c_{N - 1}}\,{\text{and}}\,{s_0}, \ldots, {s_{N - 1}} $ are 2N-dimensional basis vectors for MDCT and MDST respectively. The inner products between them are

$$ \left\{ \begin{array}{ll} \left\langle {c_k},{c_l} \right\rangle = N\delta \left( k - l \right), & k,l = 0, \ldots, N - 1 \\ \left\langle {s_k},{s_l} \right\rangle = N\delta \left( k - l \right), & k,l = 0, \ldots, N - 1 \\ \left\langle {c_k},{s_l} \right\rangle = 0, & k,l = 0, \ldots, N - 1 \\ \end{array} \right. , $$

(A.1)

where δ(·) is the unit impulse function. Together, these 2N vectors form an orthogonal basis for the 2N-dimensional real vector space. Then, for a time signal $ x(n),\,n = 0, \ldots, 2N - 1 $, with MDCT spectrum X(k) and MDST spectrum $ Y(k),\,k = 0, \ldots, N - 1 $, the energies satisfy

$$ \begin{array}{*{20}{c}} {N\left\langle {x,x} \right\rangle } \\ { = \frac{1}{N}\left\langle {\sum\limits_{k = 0}^{N - 1} {\left( {X(k){c_k} + Y(k){s_k}} \right)}, \sum\limits_{k = 0}^{N - 1} {\left( {X(k){c_k} + Y(k){s_k}} \right)} } \right\rangle } \\ { = \left\langle {X,X} \right\rangle + \left\langle {Y,Y} \right\rangle = \left\langle {X + jY,X + jY} \right\rangle } \\ \end{array} . $$

(A.2)

This verifies that the MDFT spectral energy is N times the temporal energy.
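
The inner products in (A.1) and the energy relation in (A.2) can be checked numerically. The sketch below is only an illustration: it builds the basis vectors from the kernel phase that appears in (A.3), and the frame size, the random test signal and all variable names are assumptions for this example.

```python
import numpy as np

N = 8                                   # assumed number of coefficients
n = np.arange(2 * N)
k = np.arange(N)
# Kernel phase taken from the exponent in (A.3)
phase = np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2)
C = np.cos(phase)                       # row k is the MDCT basis vector c_k
S = np.sin(phase)                       # row k is the MDST basis vector s_k

# (A.1): the 2N vectors are mutually orthogonal with squared norm N
gram_ok = (np.allclose(C @ C.T, N * np.eye(N))
           and np.allclose(S @ S.T, N * np.eye(N))
           and np.allclose(C @ S.T, 0))

rng = np.random.default_rng(0)
x = rng.standard_normal(2 * N)          # arbitrary real test frame
X, Y = C @ x, S @ x                     # MDCT and MDST spectra

# (A.2): <X,X> + <Y,Y> should equal N <x,x>
spectral_energy = X @ X + Y @ Y
temporal_energy = x @ x
```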

1.2 B. MDFT time shift and phase shift

From the MDFT definition in (3.c), when the time signal x(n) is shifted by d and satisfies $ x\left( {n - 2N} \right) = - x(n) $, its MDFT spectrum becomes

$$ \begin{array}{*{20}{c}} {\tilde Z(k) = \sum\limits_{n = 0}^{2N - 1} {x\left( {n - d} \right)\exp \left[ { - j\frac{\pi }{N}\left( {n + \frac{1}{2} + \frac{N}{2}} \right)\left( {k + \frac{1}{2}} \right)} \right]} } \\ { = \sum\limits_{n = - d}^{2N - 1 - d} {x(n)\exp \left[ { - j\frac{\pi }{N}\left( {n + d + \frac{1}{2} + \frac{N}{2}} \right)\left( {k + \frac{1}{2}} \right)} \right]} } \\ { = \sum\limits_{n = 0}^{2N - 1 - d} {x(n)\exp \left[ { - j\frac{\pi }{N}\left( {n + d + \frac{1}{2} + \frac{N}{2}} \right)\left( {k + \frac{1}{2}} \right)} \right]} } \\ { - \sum\limits_{n = 2N - d}^{2N - 1} {x\left( {n - 2N} \right)\exp \left[ { - j\frac{\pi }{N}\left( {n + d + \frac{1}{2} + \frac{N}{2}} \right)\left( {k + \frac{1}{2}} \right)} \right]} } \\ { = Z(k)\exp \left[ { - j\frac{\pi }{N}d\left( {k + \frac{1}{2}} \right)} \right]} \\ \end{array}, $$

(A.3)

where Z(k) is the MDFT spectrum of x(n) without the shift. The condition $ x\left( {n - 2N} \right) = - x(n) $ parallels the DFT’s requirement of periodicity, but with a negative sign. Real signals do not satisfy this condition in general, so for them, and for $ d \ll 2N $, (A.3) holds only approximately.
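
The shift property (A.3) can be reproduced with a short numerical experiment. In this sketch the frame is extended outside 0, …, 2N − 1 by the anti-periodic condition, so samples that wrap around change sign; the values of N and d, the random frame and the variable names are assumptions chosen for illustration.

```python
import numpy as np

N, d = 8, 3                             # assumed frame size and shift
n = np.arange(2 * N)
k = np.arange(N)
# MDFT kernel as written in (A.3)
kernel = np.exp(-1j * np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2))

rng = np.random.default_rng(1)
x = rng.standard_normal(2 * N)

# x(n - d) under x(n - 2N) = -x(n): the d wrapped samples are negated
x_shifted = np.concatenate((-x[2 * N - d:], x[:2 * N - d]))

Z = kernel @ x                          # MDFT of the original frame
Z_shifted = kernel @ x_shifted          # MDFT of the shifted frame
predicted = Z * np.exp(-1j * np.pi / N * d * (k + 0.5))   # right-hand side of (A.3)
```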

1.3 C. Windowed MDFT

Let X(k) and Y(k) denote the sine-windowed MDCT spectrum and the cosine-windowed MDST spectrum, respectively. Then we have

$$ \begin{array}{*{20}{c}} {{Z_{+} }(k) = Y(k) + X(k)} \\ { = \sum\limits_{n = 0}^{2N - 1} {x(n)\cos \left[ {\frac{\pi }{2N}\left( {n + \frac{1}{2}} \right)} \right]\sin \left[ {\frac{\pi }{N}\left( {n + \frac{1}{2} + \frac{N}{2}} \right)\left( {k + \frac{1}{2}} \right)} \right]} } \\ { + \sum\limits_{n = 0}^{2N - 1} {x(n)\sin \left[ {\frac{\pi }{2N}\left( {n + \frac{1}{2}} \right)} \right]\cos \left[ {\frac{\pi }{N}\left( {n + \frac{1}{2} + \frac{N}{2}} \right)\left( {k + \frac{1}{2}} \right)} \right]} } \\ { = - \sum\limits_{n = 0}^{2N - 1} {x(n)\cos \left[ {\frac{\pi }{N}n\left( {k + 1} \right) + \frac{\pi }{N}\left( {k + 1} \right)\left( {\frac{1}{2} + \frac{N}{2}} \right) + \frac{\pi }{4}} \right]} } \\ \end{array}, $$

(A.4)

and

$$ \begin{array}{*{20}{c}} {{Z_{-} }(k) = Y(k) - X(k)} \\ { = \sum\limits_{n = 0}^{2N - 1} {x(n)\cos \left[ {\frac{\pi }{2N}\left( {n + \frac{1}{2}} \right)} \right]\sin \left[ {\frac{\pi }{N}\left( {n + \frac{1}{2} + \frac{N}{2}} \right)\left( {k + \frac{1}{2}} \right)} \right]} } \\ { - \sum\limits_{n = 0}^{2N - 1} {x(n)\sin \left[ {\frac{\pi }{2N}\left( {n + \frac{1}{2}} \right)} \right]\cos \left[ {\frac{\pi }{N}\left( {n + \frac{1}{2} + \frac{N}{2}} \right)\left( {k + \frac{1}{2}} \right)} \right]} } \\ { = \sum\limits_{n = 0}^{2N - 1} {x(n)\sin \left[ {\frac{\pi }{N}nk + \frac{\pi }{N}k\left( {\frac{1}{2} + \frac{N}{2}} \right) + \frac{\pi }{4}} \right]} } \\ \end{array} . $$

(A.5)

Taking (A.4), evaluated at k − 1, and (A.5) as the real part and the imaginary part respectively, we obtain

$$ \begin{array}{*{20}{c}} { - {Z_{+} }\left( {k - 1} \right) - j{Z_{-} }(k)} \\ { = \sum\limits_{n = 0}^{2N - 1} {x(n)\cos \left[ {\frac{\pi }{N}nk + \frac{\pi }{N}k\left( {\frac{1}{2} + \frac{N}{2}} \right) + \frac{\pi }{4}} \right]} } \\ { - j\sum\limits_{n = 0}^{2N - 1} {x(n)\sin \left[ {\frac{\pi }{N}nk + \frac{\pi }{N}k\left( {\frac{1}{2} + \frac{N}{2}} \right) + \frac{\pi }{4}} \right]} } \\ { = \exp \left\{ { - j\left[ {\frac{\pi }{N}k\left( {\frac{1}{2} + \frac{N}{2}} \right) + \frac{\pi }{4}} \right]} \right\}\sum\limits_{n = 0}^{2N - 1} {x(n)\exp \left[ { - j\frac{\pi }{N}nk} \right]} } \\ \end{array}, $$

(A.6)

which is a 2N-point DFT with a phase shift. Moreover, with $ {Z_{+} }\left( { - 1} \right) = - {Z_{-} }(0)\,{\text{and}}\,{Z_{-} }(N) = {Z_{+} }\left( {N - 1} \right) $, (A.6) leads to (5.a).
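
The identity (A.6) can be compared against a standard FFT. The following sketch evaluates the windowed spectra directly from the sums in (A.4) and (A.5) and uses `numpy.fft.fft` for the 2N-point DFT; the frame size, the test signal and the helper names `windowed_spectra`, `z_plus` and `z_minus` are hypothetical and exist only for this example.

```python
import numpy as np

N = 8                                   # assumed frame size (even)
n = np.arange(2 * N)
rng = np.random.default_rng(2)
x = rng.standard_normal(2 * N)

def windowed_spectra(k):
    """Sine-windowed MDCT X(k) and cosine-windowed MDST Y(k) for an integer k."""
    win_arg = np.pi / (2 * N) * (n + 0.5)
    phase = np.pi / N * (n + 0.5 + N / 2) * (k + 0.5)
    X = np.sum(x * np.sin(win_arg) * np.cos(phase))
    Y = np.sum(x * np.cos(win_arg) * np.sin(phase))
    return X, Y

def z_plus(k):
    X, Y = windowed_spectra(k)
    return Y + X                        # left-hand side of (A.4)

def z_minus(k):
    X, Y = windowed_spectra(k)
    return Y - X                        # left-hand side of (A.5)

k = np.arange(2 * N)
lhs = np.array([-z_plus(kk - 1) - 1j * z_minus(kk) for kk in k])
dft = np.fft.fft(x)                     # sum over n of x(n) exp(-j*pi*n*k/N)
rhs = np.exp(-1j * (np.pi / N * k * (0.5 + N / 2) + np.pi / 4)) * dft
```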

1.4 D. Properties of MDCT and MDST transform matrices

From (6), we can see that each column vector of C₀ and S₁ is odd-symmetric, and each column vector of C₁ and S₀ is even-symmetric. With the help of the anti-diagonal matrix J, which has only 1s on its anti-diagonal, these symmetries are equivalent to $ {\mathbf{J}}{{\mathbf{C}}_0} = - {{\mathbf{C}}_0},{\mathbf{J}}{{\mathbf{S}}_1} = - {{\mathbf{S}}_1}\,{\text{and}}\,{\mathbf{J}}{{\mathbf{C}}_1} = {{\mathbf{C}}_1},{\mathbf{J}}{{\mathbf{S}}_0} = {{\mathbf{S}}_0} $ respectively. From this and $ {{\mathbf{J}}^{\text{T}}}{\mathbf{J}} = {\mathbf{JJ}} = {\mathbf{I}} $, we have

$$ {\mathbf{S}}_0^{\text{T}}{{\mathbf{C}}_0} = {\mathbf{S}}_0^{\text{T}}{{\mathbf{J}}^{\text{T}}}{\mathbf{J}}{{\mathbf{C}}_0} = {\left( {{\mathbf{J}}{{\mathbf{S}}_0}} \right)^{\text{T}}}\left( {{\mathbf{J}}{{\mathbf{C}}_0}} \right) = - {\mathbf{S}}_0^{\text{T}}{{\mathbf{C}}_0}, $$

(A.7)

which implies $ {\mathbf{S}}_0^{\text{T}}{{\mathbf{C}}_0} = {\mathbf{0}} $. For the same reason, $ {\mathbf{S}}_1^{\text{T}}{{\mathbf{C}}_1} = {\mathbf{0}} $. For the windowed case, the second equation of (14), W₁ = JW₀J, together with the fact that W₀ and W₁ are diagonal matrices and therefore commute (W₀W₁ = W₁W₀), gives

$$ \begin{array}{*{20}{c}} {{\mathbf{S}}_0^{\text{T}}{{\mathbf{W}}_1}{{\mathbf{W}}_0}{{\mathbf{C}}_0} = {\mathbf{S}}_0^{\text{T}}{{\mathbf{J}}^{\text{T}}}{\mathbf{J}}{{\mathbf{W}}_1}{\mathbf{JJ}}{{\mathbf{W}}_0}{\mathbf{JJ}}{{\mathbf{C}}_0}} \\ { = {{\left( {{\mathbf{J}}{{\mathbf{S}}_0}} \right)}^{\text{T}}}\left( {{\mathbf{J}}{{\mathbf{W}}_1}{\mathbf{J}}} \right)\left( {{\mathbf{J}}{{\mathbf{W}}_0}{\mathbf{J}}} \right)\left( {{\mathbf{J}}{{\mathbf{C}}_0}} \right)} \\ { = - {\mathbf{S}}_0^{\text{T}}{{\mathbf{W}}_0}{{\mathbf{W}}_1}{{\mathbf{C}}_0}} \\ { = - {\mathbf{S}}_0^{\text{T}}{{\mathbf{W}}_1}{{\mathbf{W}}_0}{{\mathbf{C}}_0}} \\ \end{array}, $$

(A.8)

which implies $ {\mathbf{S}}_0^{\text{T}}{{\mathbf{W}}_1}{{\mathbf{W}}_0}{{\mathbf{C}}_0} = {\mathbf{0}} $. For the same reason, $ {\mathbf{S}}_1^{\text{T}}{{\mathbf{W}}_0}{{\mathbf{W}}_1}{{\mathbf{C}}_1} = {\mathbf{0}} $. By a procedure similar to (A.8), we also have $ {\mathbf{S}}_0^{\text{T}}{{\mathbf{W}}_1}{{\mathbf{W}}_1}{{\mathbf{C}}_1} = {\mathbf{S}}_0^{\text{T}}{{\mathbf{W}}_0}{{\mathbf{W}}_0}{{\mathbf{C}}_1} $. From this and the first equation of (14), $ {{\mathbf{W}}_0}{{\mathbf{W}}_0} + {{\mathbf{W}}_1}{{\mathbf{W}}_1} = {\mathbf{I}} $, we can see

$$ \begin{array}{*{20}{c}} {{\mathbf{S}}_0^{\text{T}}{{\mathbf{W}}_1}{{\mathbf{W}}_1}{{\mathbf{C}}_1} = \frac{1}{2}\left( {{\mathbf{S}}_0^{\text{T}}{{\mathbf{W}}_0}{{\mathbf{W}}_0}{{\mathbf{C}}_1} + {\mathbf{S}}_0^{\text{T}}{{\mathbf{W}}_1}{{\mathbf{W}}_1}{{\mathbf{C}}_1}} \right)} \\ { = \frac{1}{2}{\mathbf{S}}_0^{\text{T}}\left( {{{\mathbf{W}}_0}{{\mathbf{W}}_0} + {{\mathbf{W}}_1}{{\mathbf{W}}_1}} \right){{\mathbf{C}}_1}} \\ { = \frac{1}{2}{\mathbf{S}}_0^{\text{T}}{{\mathbf{C}}_1}} \\ \end{array} . $$

(A.9)

And for the same reason, $ {\mathbf{S}}_1^{\text{T}}{{\mathbf{W}}_0}{{\mathbf{W}}_0}{{\mathbf{C}}_0} = {\mathbf{S}}_1^{\text{T}}{{\mathbf{C}}_0}/2 $.
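
The symmetry and orthogonality relations of this section can be verified numerically. The sketch below assumes that C₀ and S₀ hold the first N time samples of each basis vector and C₁ and S₁ the last N (rows index time, columns index frequency), and that W₀ and W₁ are the two halves of a sine window; equations (6) and (14) are not reproduced here, so these definitions and all variable names are assumptions chosen to satisfy the relations quoted in the text.

```python
import numpy as np

N = 8                                   # assumed half frame length
n = np.arange(2 * N)
k = np.arange(N)
phase = np.pi / N * np.outer(n + 0.5 + N / 2, k + 0.5)
C0, C1 = np.cos(phase[:N]), np.cos(phase[N:])   # MDCT kernel, first / second half
S0, S1 = np.sin(phase[:N]), np.sin(phase[N:])   # MDST kernel, first / second half
J = np.fliplr(np.eye(N))                        # ones on the anti-diagonal
w = np.sin(np.pi / (2 * N) * (n + 0.5))         # sine window (assumption)
W0, W1 = np.diag(w[:N]), np.diag(w[N:])

checks = {
    "J C0 = -C0": np.allclose(J @ C0, -C0),
    "J S1 = -S1": np.allclose(J @ S1, -S1),
    "J C1 = C1": np.allclose(J @ C1, C1),
    "J S0 = S0": np.allclose(J @ S0, S0),
    "W1 = J W0 J": np.allclose(W1, J @ W0 @ J),
    "W0 W0 + W1 W1 = I": np.allclose(W0 @ W0 + W1 @ W1, np.eye(N)),
    "(A.7)": np.allclose(S0.T @ C0, 0) and np.allclose(S1.T @ C1, 0),
    "(A.8)": np.allclose(S0.T @ W1 @ W0 @ C0, 0),
    "(A.9)": np.allclose(S0.T @ W1 @ W1 @ C1, 0.5 * S0.T @ C1),
    "S1^T W0 W0 C0 = S1^T C0 / 2": np.allclose(S1.T @ W0 @ W0 @ C0,
                                                0.5 * S1.T @ C0),
}
```

Each entry of `checks` is expected to be `True` under these assumed definitions.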

1.5 E. Properties of the conversion matrix T

As in (7.b), P is a diagonal matrix with $ + 1, - 1, + 1, - 1, \ldots $ on its diagonal, implying $ {\mathbf{P}}{{\mathbf{P}}^{\text{T}}} = {\mathbf{I}} $. With $ {{\mathbf{S}}_0} = - {{\mathbf{C}}_1}{\mathbf{P}},{{\mathbf{S}}_1} = {{\mathbf{C}}_0}{\mathbf{P}} $ in (7.b), we have $ {{\mathbf{S}}_1}{\mathbf{S}}_1^{\text{T}} = {{\mathbf{C}}_0}{\mathbf{C}}_0^{\text{T}},\,{{\mathbf{S}}_0}{\mathbf{S}}_0^{\text{T}} = {{\mathbf{C}}_1}{\mathbf{C}}_1^{\text{T}},\,{{\mathbf{S}}_0}{\mathbf{S}}_1^{\text{T}} = - {{\mathbf{C}}_1}{\mathbf{C}}_0^{\text{T}},\,{{\mathbf{S}}_1}{\mathbf{S}}_0^{\text{T}} = - {{\mathbf{C}}_0}{\mathbf{C}}_1^{\text{T}} $. With the help of $ {\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0} + {\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1} = N{\mathbf{I}}\,{\text{and}}\,{{\mathbf{C}}_1}{\mathbf{C}}_0^{\text{T}} = {{\mathbf{C}}_0}{\mathbf{C}}_1^{\text{T}} = {\mathbf{0}} $ in (7.a), it follows that the conversion matrix T defined in (10.b) is orthogonal. The two cross terms vanish because $ {{\mathbf{C}}_1}{\mathbf{C}}_0^{\text{T}} = {{\mathbf{C}}_0}{\mathbf{C}}_1^{\text{T}} = {\mathbf{0}} $, so their sign can be changed without affecting the sum, that is,

$$ \begin{array}{*{20}{c}} {{{\mathbf{T}}^{\text{T}}}{\mathbf{T}} = \frac{1}{N^2}{{\left( {{\mathbf{S}}_1^{\text{T}}{{\mathbf{C}}_0} + {\mathbf{S}}_0^{\text{T}}{{\mathbf{C}}_1}} \right)}^{\text{T}}}\left( {{\mathbf{S}}_1^{\text{T}}{{\mathbf{C}}_0} + {\mathbf{S}}_0^{\text{T}}{{\mathbf{C}}_1}} \right)} \\ { = \frac{1}{N^2}\left( {{\mathbf{C}}_0^{\text{T}}{{\mathbf{S}}_1}{\mathbf{S}}_1^{\text{T}}{{\mathbf{C}}_0} + {\mathbf{C}}_1^{\text{T}}{{\mathbf{S}}_0}{\mathbf{S}}_0^{\text{T}}{{\mathbf{C}}_1} + {\mathbf{C}}_1^{\text{T}}{{\mathbf{S}}_0}{\mathbf{S}}_1^{\text{T}}{{\mathbf{C}}_0} + {\mathbf{C}}_0^{\text{T}}{{\mathbf{S}}_1}{\mathbf{S}}_0^{\text{T}}{{\mathbf{C}}_1}} \right)} \\ { = \frac{1}{N^2}\left( {{\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0}{\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0} + {\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1}{\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1} - {\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1}{\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0} - {\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0}{\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1}} \right)} \\ { = \frac{1}{N^2}\left( {{\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0}{\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0} + {\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1}{\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1} + {\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1}{\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0} + {\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0}{\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1}} \right)} \\ { = \frac{1}{N^2}\left( {{\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0} + {\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1}} \right)\left( {{\mathbf{C}}_0^{\text{T}}{{\mathbf{C}}_0} + {\mathbf{C}}_1^{\text{T}}{{\mathbf{C}}_1}} \right)} \\ { = {\mathbf{I}}} \\ \end{array} . $$

(A.10)
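
A brief numerical check of (A.10) is sketched below. It takes T in the form used in the first line of (A.10) and assumes, as elsewhere in this appendix, that C₀, S₀ and C₁, S₁ are the first and second time halves of kernels whose phase follows (A.3); the frame size and variable names are assumptions for this example, since (7.a), (7.b) and (10.b) are not reproduced here.

```python
import numpy as np

N = 8                                   # assumed half frame length
n = np.arange(2 * N)
k = np.arange(N)
phase = np.pi / N * np.outer(n + 0.5 + N / 2, k + 0.5)
C0, C1 = np.cos(phase[:N]), np.cos(phase[N:])
S0, S1 = np.sin(phase[:N]), np.sin(phase[N:])
P = np.diag((-1.0) ** k)                # +1, -1, +1, -1, ... on the diagonal

# Relations quoted from (7.a) and (7.b)
relations_ok = (np.allclose(S0, -C1 @ P)
                and np.allclose(S1, C0 @ P)
                and np.allclose(C0.T @ C0 + C1.T @ C1, N * np.eye(N))
                and np.allclose(C1 @ C0.T, 0))

# Conversion matrix as it appears in the first line of (A.10)
T = (S1.T @ C0 + S0.T @ C1) / N
T_is_orthogonal = np.allclose(T.T @ T, np.eye(N))
```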

About this article

Cite this article

Chen, S., Xiong, N., Hyuk Park, J. et al. Spatial parameters for audio coding: MDCT domain analysis and synthesis. Multimed Tools Appl 48, 225–246 (2010). https://doi.org/10.1007/s11042-009-0326-4

Issue Date: June 2010
DOI: https://doi.org/10.1007/s11042-009-0326-4
