Abstract
It has been shown that the higher the correlation between the time-domain samples of a digital signal, the stronger the energy compaction property in the Discrete Cosine Transform (DCT) domain. This paper investigates the limits of the DCT energy compaction property in speech signals by segmenting the cover speech signal into correlated segments and hiding secret data in each segment. The hiding process uses a strategy inspired by the Amplitude Modulation (AM) technique. Segmentation is expected to increase the homogeneity within each segment, which causes the energy of the signal to be strongly compacted into a few critical DCT coefficients. A substantial number of insignificant DCT coefficients can therefore be replaced with the secret data without sacrificing the quality of the signal. Experimental results demonstrate the effectiveness of the proposed scheme, which outperforms other speech steganography techniques recently published in the literature.
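The compaction argument in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the segment here has a fixed length rather than being grown, and the replacement rule is a generic overwrite of the coefficient tail rather than the AM-inspired strategy. The names `dct`, `idct`, `coeffs_for_energy`, `embed`, `extract`, and the parameters `keep` and `scale` are hypothetical and chosen only for this illustration.

```python
import math
import random


def dct(x):
    """Orthonormal DCT-II of a list of floats (direct O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        norm = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(norm * s)
    return out


def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        for k in range(1, n):
            s += c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def coeffs_for_energy(c, fraction=0.99):
    """Number of leading coefficients needed to hold `fraction` of the energy."""
    total = sum(v * v for v in c)
    acc = 0.0
    for k, v in enumerate(c):
        acc += v * v
        if acc >= fraction * total:
            return k + 1
    return len(c)


def embed(cover, secret, keep, scale=1e-3):
    """Assumed rule: overwrite coefficients from index `keep` with scaled secret samples."""
    c = dct(cover)
    if len(secret) > len(c) - keep:
        raise ValueError("secret does not fit in the insignificant coefficients")
    for j, s in enumerate(secret):
        c[keep + j] = scale * s
    return idct(c)


def extract(stego, keep, count, scale=1e-3):
    """Read back `count` secret samples from the same coefficient positions."""
    c = dct(stego)
    return [c[keep + j] / scale for j in range(count)]


# A highly correlated (smooth) segment versus an uncorrelated (noisy) one.
n = 64
smooth = [math.sin(2 * math.pi * i / n) for i in range(n)]
rng = random.Random(0)
noisy = [rng.uniform(-1.0, 1.0) for _ in range(n)]

# The smooth segment needs far fewer leading coefficients for 99% of its energy.
print(coeffs_for_energy(dct(smooth)), coeffs_for_energy(dct(noisy)))
```

In this sketch the hiding capacity of a segment is `len(cover) - keep`, so a segment whose energy compacts into fewer coefficients leaves more room for secret data at the same distortion.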
About this article
Cite this article
Baziyad, M., Shahin, I., Rabie, T. et al. Maximizing embedding capacity for speech steganography: a segment-growing approach. Multimed Tools Appl 80, 24469–24490 (2021). https://doi.org/10.1007/s11042-020-10228-6
DOI: https://doi.org/10.1007/s11042-020-10228-6