Maximizing embedding capacity for speech steganography: a segment-growing approach

Multimedia Tools and Applications

Abstract

It has been shown that the higher the correlation between samples of a digital signal in the time domain, the stronger the energy compaction property in the Discrete Cosine Transform (DCT) domain. This paper investigates the limits of the DCT energy compaction property in speech signals by segmenting the cover speech signal into correlated segments and hiding secret data in each segment. The hiding process uses a strategy inspired by the Amplitude Modulation (AM) technique. Segmentation is expected to increase the homogeneity within each segment, which causes the energy of the signal to be strongly compacted into a few critical DCT coefficients. A substantial number of insignificant DCT coefficients can therefore be replaced with the secret data without sacrificing the quality of the signal. Experimental results demonstrate the effectiveness of the proposed scheme, which outperforms other speech steganography techniques recently published in the literature.
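
The compaction argument can be illustrated with a minimal sketch, which is not the scheme proposed in the paper. It assumes a fixed 64-sample segment rather than grown segments, uses a slow sinusoid as a stand-in for a correlated speech segment, and replaces the AM-inspired hiding strategy with plain overwriting of the highest-index DCT coefficients. All function names, the segment length, and the `scale` factor are hypothetical choices made for illustration.

```python
import math
import random


def dct(x):
    """Orthonormal DCT-II of a list of floats (O(N^2), fine for short segments)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out


def coeffs_for_energy(x, fraction=0.99):
    """Smallest number of DCT coefficients that hold `fraction` of the energy."""
    energies = sorted((c * c for c in dct(x)), reverse=True)
    total = sum(energies)
    running = 0.0
    for count, e in enumerate(energies, start=1):
        running += e
        if running >= fraction * total:
            return count
    return len(energies)


def embed(segment, secret, scale=1e-3):
    """Assumed strategy: overwrite the last len(secret) DCT coefficients."""
    c = dct(segment)
    start = len(c) - len(secret)
    c[start:] = [scale * s for s in secret]
    return idct(c)


def extract(stego, n_secret, scale=1e-3):
    """Read the secret values back from the tail of the stego segment's DCT."""
    return [v / scale for v in dct(stego)[-n_secret:]]


random.seed(0)
N = 64
# Highly correlated segment: neighbouring samples change slowly.
smooth = [math.sin(2 * math.pi * 2 * i / N) for i in range(N)]
# Uncorrelated segment: white noise.
noise = [random.uniform(-1, 1) for _ in range(N)]

k_smooth = coeffs_for_energy(smooth)
k_noise = coeffs_for_energy(noise)

# Hide 32 secret values in the 32 least significant coefficients.
secret = [random.uniform(-1, 1) for _ in range(32)]
stego = embed(smooth, secret)
recovered = extract(stego, len(secret))
max_distortion = max(abs(a - b) for a, b in zip(smooth, stego))
```

In this sketch the correlated segment needs far fewer coefficients than the noise segment to hold 99% of its energy, so half of its coefficients can be overwritten while the time-domain samples change only slightly. Comparing `k_smooth` with `k_noise` and inspecting `max_distortion` shows the effect.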

Author information

Corresponding author

Correspondence to Mohammed Baziyad.

Cite this article

Baziyad, M., Shahin, I., Rabie, T. et al. Maximizing embedding capacity for speech steganography: a segment-growing approach. Multimed Tools Appl 80, 24469–24490 (2021). https://doi.org/10.1007/s11042-020-10228-6

  • DOI: https://doi.org/10.1007/s11042-020-10228-6
