Simultaneous speech coding and de-noising in a dictionary based quantized CS framework

International Journal of Speech Technology

Abstract

Speech compression, or speech coding, is essential for communicating speech signals effectively in resource-limited scenarios, and researchers have long worked to lower transmission bit rates (BR) without much compromise in speech quality. Medium-BR hybrid speech coding schemes have gained considerable interest in recent years, most of them based on CELP, the basic medium bit-rate coding scheme. In this work, we provide insight into the capabilities of compressive sensing (CS) in speech processing and propose a novel idea in the quantized framework. The paper demonstrates three major aspects: (1) inherent de-noising of noisy speech by the CS-based coder along with compression, (2) quantization of CS measurements to achieve medium transmission bit rates, and (3) enhancement of the quality and compression performance of the coder through better sparse representations of speech using dictionaries. The results indicate that the proposed scheme offers better compression than basic Gaussian-codebook CELP. The CS scheme has the added advantage of inherent noise suppression and is more robust to background noise than medium bit-rate speech coding systems based on parameter extraction.
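To make the three aspects above concrete, the following is a minimal sketch of a quantized CS pipeline on one synthetic frame. It is not the coder proposed in the paper: the fixed DCT basis stands in for a learned dictionary, the uniform scalar quantizer and the orthogonal matching pursuit (OMP) decoder are assumptions chosen for brevity, and all function names and parameter values (`n`, `m`, `k`, `bits`) are hypothetical.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II atoms as columns (stand-in for a learned dictionary)."""
    k = np.arange(n)[None, :]   # atom (frequency) index
    t = np.arange(n)[:, None]   # sample index
    psi = np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    psi[:, 0] *= np.sqrt(1.0 / n)
    psi[:, 1:] *= np.sqrt(2.0 / n)
    return psi

def uniform_quantize(y, bits):
    """Uniform scalar quantizer over the observed range of y (illustrative)."""
    lo, hi = y.min(), y.max()
    levels = 2 ** bits
    step = (hi - lo) / levels
    idx = np.clip(np.floor((y - lo) / step), 0, levels - 1)
    return lo + (idx + 0.5) * step   # reconstruct at cell midpoints

def omp(a, y, k):
    """Orthogonal matching pursuit: greedy k-sparse fit of y ~ a @ s."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(a.T @ residual)))
        if j not in support:
            support.append(j)
        # Least-squares refit on the atoms selected so far.
        coef, *_ = np.linalg.lstsq(a[:, support], y, rcond=None)
        residual = y - a[:, support] @ coef
    s = np.zeros(a.shape[1])
    s[support] = coef
    return s

def snr_db(ref, est):
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((ref - est) ** 2))

def demo(n=256, m=128, k=5, bits=6, seed=0):
    rng = np.random.default_rng(seed)
    psi = dct_basis(n)
    # Synthetic k-sparse frame (assumption: real speech is only approximately sparse).
    s = np.zeros(n)
    idx = rng.choice(n, size=k, replace=False)
    s[idx] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)
    x = psi @ s
    # Encoder: random Gaussian measurements, then scalar quantization.
    phi = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))
    y_q = uniform_quantize(phi @ x, bits)
    # Decoder: sparse recovery in the dictionary domain, then synthesis.
    s_hat = omp(phi @ psi, y_q, k)
    x_hat = psi @ s_hat
    return x, x_hat, y_q, s_hat

if __name__ == "__main__":
    x, x_hat, y_q, s_hat = demo()
    print(f"bits per frame: {y_q.size * 6}, reconstruction SNR: {snr_db(x, x_hat):.1f} dB")
```

In this sketch the cost per frame is simply `m * bits`. Under the purely illustrative assumption of 8 kHz sampling, 128 measurements at 6 bits for each 256-sample (32 ms) frame would amount to 768 bits per frame, or 24 kbit/s, before any further coding. Replacing `dct_basis` with a dictionary learned from speech is where the third aspect of the abstract would enter.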

Notes

  1. http://dsp.rice.edu/.

Author information

Corresponding author

Correspondence to Deepak Mishra.

About this article

Cite this article

Ramdas, V., Gorthi, S.S.R.K. & Mishra, D. Simultaneous speech coding and de-noising in a dictionary based quantized CS framework. Int J Speech Technol 19, 509–523 (2016). https://doi.org/10.1007/s10772-016-9345-5

  • DOI: https://doi.org/10.1007/s10772-016-9345-5
