Abstract
In this paper, we propose a new summation method for MPEG-1 encoded audio signals in the compressed domain. The method operates directly in the subband domain to reduce delay and implementation complexity. The main problem to be solved is the bit allocation and subsequent quantization of the summed signal. The bit allocation of encoded MPEG signals is based on the signal-to-mask ratios of the individual signals. To estimate the signal-to-mask ratios of the combined signal, the algorithm needs only the information contained in the individual encoded frames. Along with a description of the proposed algorithm, we detail its variations and applications. We also evaluate the performance of the method against direct summation in the time domain.
Summary
This article presents a new method for summing MPEG-1 coded audio signals in the compressed domain. It processes the signals directly in the subband domain in order to reduce the delay and the algorithmic complexity introduced by the filter banks. The main problem with this approach is the bit allocation for the sum signal. The algorithm therefore uses only the information contained in the individual frames to estimate the signal-to-mask ratios of the new signal. Other variants of the algorithm are also proposed. Finally, a performance evaluation against direct summation in the time domain is given.
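As a rough illustration of the idea outlined above, the following Python sketch adds two frames in the subband domain and estimates the signal-to-mask ratio (SMR) of the sum from per-subband quantities of the individual signals. It is not the authors' algorithm. The function names are hypothetical, and the sketch assumes that the two signals are uncorrelated (so their powers add) and that their masking thresholds add in power. How the per-subband levels and SMRs are recovered from an encoded frame is not stated here, so they are taken as inputs.

```python
import math


def db_to_power(level_db):
    """Convert a level in dB to linear power."""
    return 10 ** (level_db / 10)


def power_to_db(power):
    """Convert linear power to a level in dB."""
    return 10 * math.log10(power)


def sum_subband_frames(frame_a, frame_b):
    """Add two frames of dequantized subband samples.

    Each frame is a list of subbands, and each subband is a list of
    samples. Both frames are assumed to have the same layout.
    """
    return [
        [x + y for x, y in zip(band_a, band_b)]
        for band_a, band_b in zip(frame_a, frame_b)
    ]


def estimate_combined_smr(level_a_db, smr_a_db, level_b_db, smr_b_db):
    """Estimate the SMR (in dB) of the sum signal in one subband.

    Assumptions (not taken from the source): the signals are
    uncorrelated, so signal powers add, and the masking thresholds
    are power-additive.
    """
    signal_power = db_to_power(level_a_db) + db_to_power(level_b_db)
    # Mask level of each signal = signal level minus its SMR.
    mask_power = (
        db_to_power(level_a_db - smr_a_db)
        + db_to_power(level_b_db - smr_b_db)
    )
    return power_to_db(signal_power / mask_power)


if __name__ == "__main__":
    summed = sum_subband_frames([[1.0, 2.0]], [[0.5, -2.0]])
    print(summed)
    print(estimate_combined_smr(60.0, 20.0, 60.0, 20.0))
```

Under these assumptions, two signals with the same level and the same SMR in a subband yield a sum with that same SMR, since the signal power and the mask power both double. The estimated SMRs would then drive the bit allocation of the summed frame.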
Cite this article
Touimi, A.B., Mahieux, Y. & Lanciani, C.A. A summation algorithm for MPEG-1 coded audio signals: a first step towards audio processing in the compressed domain. Ann. Télécommun. 55, 108–116 (2000). https://doi.org/10.1007/BF03001904