Steganographic Pulse-Based Recovery for Robust ACELP Transmission over Erasure Channels

López-Oller, Domingo; Gomez, Angel M.; Córdoba, José Luis Pérez; Geiser, Bernd; Vary, Peter

doi:10.1007/978-3-642-35292-8_27

Part of the book series: Communications in Computer and Information Science (CCIS, volume 328)

Abstract

This paper presents an ACELP-based speech transmission scheme that is robust to frame erasures. The scheme is based on the steganographic transmission of media-specific FEC codes. These FEC codes are intended to prevent the adaptive codebook desynchronization that frequently occurs in the decoder after a frame erasure. They are based on a multipulse representation of the previous frame's excitation. By means of steganographic methods, the FEC codes are embedded into the codec bitstream and therefore cause no increase in bit rate. In particular, an ACELP-specific steganography approach exploits the inefficiencies in the ACELP codebook search and imposes certain algebraic restrictions which allow data to be hidden in the ACELP codewords. In effect, side information can be transmitted without compromising the speech quality of the codec. The performance of our proposal is evaluated with the well-known AMR ACELP codec, in terms of both speech quality and intelligibility. To this end, two objective measures, PESQ and STOI, are applied. The proposed coding scheme achieves a noticeable improvement over the legacy codec under adverse channel conditions without consuming any additional bit rate.
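The idea of hiding data through algebraic restrictions on the codebook search can be illustrated with a toy sketch. This is not the authors' embedding rule: the track layout, the parity constraint, and the names `TRACKS`, `hidden_bit` and `constrained_search` are assumptions made for illustration, and the score is a simplified stand-in for the real ACELP search criterion, which works on a filtered target signal. The encoder only considers pulse combinations whose index parity equals the hidden bit, and the decoder reads the bit back from the received indices.

```python
from itertools import product

# Hypothetical toy layout: two interleaved tracks in an 8-sample subframe,
# one pulse per track (real ACELP codebooks are larger and use signed pulses).
TRACKS = [[0, 2, 4, 6], [1, 3, 5, 7]]


def hidden_bit(indices):
    """Bit carried by a codeword: parity of the per-track position indices."""
    return sum(indices) % 2


def constrained_search(target, bit, tracks=TRACKS):
    """Pick one pulse position per track, restricted to codewords whose
    parity equals `bit`, maximising a simplified score (sum of |target|)."""
    best_score, best_indices = -1.0, None
    for indices in product(*(range(len(t)) for t in tracks)):
        if hidden_bit(indices) != bit:
            continue  # algebraic restriction: skip codewords with the wrong parity
        score = sum(abs(target[t[i]]) for t, i in zip(tracks, indices))
        if score > best_score:
            best_score, best_indices = score, indices
    return best_indices


# Illustrative target values, not taken from any codec.
target = [0.1, 0.9, 0.2, 0.3, 0.8, 0.1, 0.7, 0.2]
for bit in (0, 1):
    chosen = constrained_search(target, bit)
    # The decoder recovers the bit from the indices alone.
    assert hidden_bit(chosen) == bit
```

In this toy case the unconstrained optimum already has even parity, so hiding a 0 costs nothing, while hiding a 1 forces the second-best position on one track. This mirrors the general trade-off: the payload is paid for with a slightly restricted search rather than with extra bits.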

Author information

Authors and Affiliations

Departamento de Teoría de la Señal, Telemática y Comunicaciones, University of Granada, Spain
Domingo López-Oller, Angel M. Gomez & José Luis Pérez Córdoba
Institute of Communication Systems and Data Processing, RWTH Aachen University, Germany
Bernd Geiser & Peter Vary

Editor information

Editors and Affiliations

Escuela Politecnica Superior, Universidad Autonoma de Madrid. C/ Francisco, Tomas y Valiente 11, 28049, Madrid, Spain
Doroteo Torre Toledano
Centro Politécnico Superior, Edificio Ada Byron, C/ María de Luna nº 1, 50018, Zaragoza, Spain
Alfonso Ortega Giménez
Universidade de Aveiro, Campus Universitário Aveiro, 3810-193, Aveiro, Portugal
António Teixeira
Escuela Politecnica Superior, Universidad Autonoma de Madrid, C/ Francisco, Tomas y Valiente 11, 28049, Madrid, Spain
Joaquín González Rodríguez
E.T.S.I.Telecomunicacion, Universidad Politécnica de Madrid, Ciudad Universitaria s/n, 28040, Madrid, Spain
Luis Hernández Gómez & Rubén San Segundo Hernández
Escuela Politecnica Superior, Universidad Autonoma de Madrid, C/ Francisco, Tomas y Valiente 11, 28049, Madrid, Spain
Daniel Ramos Castro

About this paper

Cite this paper

López-Oller, D., Gomez, A.M., Córdoba, J.L.P., Geiser, B., Vary, P. (2012). Steganographic Pulse-Based Recovery for Robust ACELP Transmission over Erasure Channels. In: Torre Toledano, D., et al. Advances in Speech and Language Technologies for Iberian Languages. Communications in Computer and Information Science, vol 328. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35292-8_27

DOI: https://doi.org/10.1007/978-3-642-35292-8_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35291-1
Online ISBN: 978-3-642-35292-8
eBook Packages: Computer Science, Computer Science (R0)
