A Novel Error Mitigation Scheme Based on Replacement Vectors and FEC Codes for Speech Recovery in Loss-Prone Channels

  • Conference paper
Advances in Speech and Language Technologies for Iberian Languages (IberSPEECH 2016)

Abstract

In this paper, we propose an error mitigation scheme that combines two approaches: a replacement super vector technique, which provides replacements to reconstruct both the LPC coefficients and the excitation signal along bursts of lost packets, and a Forward Error Correction (FEC) code, which minimizes error propagation after the last lost frame. Moreover, this FEC code is embedded into the bitstream in order to avoid any bitrate increase and to keep the codec operating in a standard-compliant way on clean transmissions. The success of our recovery technique relies heavily on the quantization of the speech parameters (LPC coefficients and the excitation signal), especially in the case of the excitation signal, where a modified version of the well-known Linde-Buzo-Gray (LBG) algorithm is applied. The performance of our proposal is evaluated on the AMR codec in terms of speech quality using the PESQ algorithm. Our proposal achieves a noticeable improvement over the standard (legacy) AMR codec under adverse channel conditions, without incurring high computational costs or delays during the decoding stage and without consuming any additional bitrate.
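For readers unfamiliar with the LBG algorithm mentioned in the abstract, the following is a minimal sketch of the textbook version (codebook splitting followed by nearest-neighbour and centroid updates). It is not the modified variant the authors apply to the excitation signal, whose details are not given here. The function names (`lbg`, `nearest`, `centroid`), the additive split offset `eps`, and the stopping tolerance `tol` are illustrative assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, vector):
    """Index of the codeword closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], vector))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def lbg(training, size, eps=0.01, tol=1e-6, max_iter=100):
    """Textbook LBG: grow a codebook to `size` codewords (a power of two).

    Assumption: `eps` is an additive split offset and `tol` a relative
    distortion threshold; both are illustrative choices, not values
    taken from the paper.
    """
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split every codeword into two slightly shifted copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        previous = float("inf")
        for _ in range(max_iter):
            # Nearest-neighbour step: assign each training vector to a cell.
            cells = [[] for _ in codebook]
            distortion = 0.0
            for v in training:
                i = nearest(codebook, v)
                cells[i].append(v)
                distortion += squared_distance(codebook[i], v)
            distortion /= len(training)
            # Centroid step: empty cells keep their previous codeword.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
            # Stop once the average distortion no longer improves.
            if previous - distortion <= tol * max(distortion, 1e-12):
                break
            previous = distortion
    return codebook
```

At decoding time, a quantized parameter vector would be represented by the index returned by `nearest`; how the scheme maps such indices to replacement vectors is described in the paper itself and is not reproduced in this sketch.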

J.L. Pérez-Córdoba—This work has been supported by an FPI grant from the Spanish Ministry of Education and by the MICINN TEC2013-46690-P project.



Author information

Corresponding author

Correspondence to Domingo López-Oller.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

López-Oller, D., Gomez, A.M., Pérez-Córdoba, J.L. (2016). A Novel Error Mitigation Scheme Based on Replacement Vectors and FEC Codes for Speech Recovery in Loss-Prone Channels. In: Abad, A., et al. Advances in Speech and Language Technologies for Iberian Languages. IberSPEECH 2016. Lecture Notes in Computer Science, vol 10077. Springer, Cham. https://doi.org/10.1007/978-3-319-49169-1_5

  • DOI: https://doi.org/10.1007/978-3-319-49169-1_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49168-4

  • Online ISBN: 978-3-319-49169-1

  • eBook Packages: Computer Science (R0)
