Adaptive Signal Models for Wide-Band Speech and Audio Compression

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2005)

Abstract

This paper deals with the application of adaptive signal models to parametric speech and audio compression. The matching pursuit algorithm is used to extract sinusoidal components and transients from audio signals. The resulting residue is perceptually modelled as a noise-like signal. When a transient is detected, psychoacoustic-adapted matching pursuits are performed using a wavelet-based dictionary followed by a harmonic one. Otherwise, matching pursuit is applied only to the harmonic dictionary. This multi-part model (Sines + Transients + Noise) is successfully applied to speech and audio coding, ensuring high perceptual quality at low bit rates (close to 16 kbps for most of the signals considered for testing).
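
The analysis flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It uses plain (not psychoacoustic-adapted) matching pursuit, and a dictionary of unit impulses stands in for the wavelet-based dictionary. All function names, frame sizes, and iteration counts are hypothetical choices made for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def energy(v):
    return sum(x * x for x in v)

def matching_pursuit(signal, dictionary, n_iter):
    """Plain greedy matching pursuit over a list of unit-norm atoms."""
    residual = list(signal)
    picks = []  # (atom index, coefficient) pairs
    for _ in range(n_iter):
        # Correlate the current residual with every atom.
        corrs = [sum(r * a for r, a in zip(residual, atom)) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(corrs[k]))
        coeff = corrs[best]
        # Subtract the projection onto the best-matching atom.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
        picks.append((best, coeff))
    return picks, residual

def harmonic_dictionary(n, n_freqs):
    """Cosine/sine atom pairs at bins 1..n_freqs (keep n_freqs below n / 2)."""
    atoms = []
    for k in range(1, n_freqs + 1):
        atoms.append(normalize([math.cos(2 * math.pi * k * t / n) for t in range(n)]))
        atoms.append(normalize([math.sin(2 * math.pi * k * t / n) for t in range(n)]))
    return atoms

def impulse_dictionary(n):
    """Assumption: unit impulses as a crude stand-in for a wavelet dictionary."""
    return [[1.0 if t == p else 0.0 for t in range(n)] for p in range(n)]

def encode_frame(frame, transient_detected, n_transient=1, n_sines=4):
    """Transient dictionary first (only if flagged), then the harmonic one."""
    n = len(frame)
    residual = list(frame)
    transient_picks = []
    if transient_detected:
        transient_picks, residual = matching_pursuit(
            residual, impulse_dictionary(n), n_transient)
    sine_picks, residual = matching_pursuit(
        residual, harmonic_dictionary(n, n // 4), n_sines)
    # The remaining residual is what a noise model would describe.
    return transient_picks, sine_picks, residual

# Toy frame: one sinusoid at bin 5 plus a click at sample 20.
N = 64
frame = [3.0 * math.cos(2 * math.pi * 5 * t / N) for t in range(N)]
frame[20] += 10.0

transient_picks, sine_picks, residual = encode_frame(frame, transient_detected=True)
print("transient atoms:", transient_picks)
print("first sinusoidal atom:", sine_picks[0])
print("residual / frame energy:", energy(residual) / energy(frame))
```

In this toy frame the click is captured by the transient stage and the sinusoid by the harmonic stage, so little energy is left in the residual. The transient flag is passed in by hand here; the abstract does not say how transients are detected, so no detector is sketched.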




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vera-Candeas, P., Ruiz-Reyes, N., Rosa-Zurera, M., Cuevas-Martinez, J.C., López-Ferreras, F. (2005). Adaptive Signal Models for Wide-Band Speech and Audio Compression. In: Marques, J.S., Pérez de la Blanca, N., Pina, P. (eds) Pattern Recognition and Image Analysis. IbPRIA 2005. Lecture Notes in Computer Science, vol 3523. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11492542_70

  • DOI: https://doi.org/10.1007/11492542_70

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26154-4

  • Online ISBN: 978-3-540-32238-2

  • eBook Packages: Computer Science, Computer Science (R0)
