Abstract
Image steganography is the practice of hiding a message within an image. In this paper, our goal is to conceal images within audio; we convert this audio steganography problem into an image steganography problem by using the mel-spectrogram of the audio file as the cover medium. Previously, audio steganography was implemented using statistical methods such as least significant bit (LSB) encoding. Here we explore the use of deep neural networks (DNNs) and propose a new technique for hiding images within audio using deep generative models, which allows us to optimize the perceptual quality of the audio and the image reconstructed by our model. We show that our model efficiently hides images within audio, evades detection by steganalysis tools, is robust to images with different color spectra, and can hide multiple images in a single audio file.
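For context, the LSB baseline mentioned above can be sketched in a few lines. This is only an illustration of the classical technique, not the generative model proposed in this paper; the function names `lsb_embed` and `lsb_extract`, and the choice of integer PCM samples as the cover signal, are assumptions made for the example.

```python
def lsb_embed(samples, payload):
    """Hide payload bytes in the least significant bit of integer samples."""
    # Unpack each byte into 8 bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("payload too large for cover signal")
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the sample, then set it to the payload bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def lsb_extract(stego, num_bytes):
    """Recover num_bytes of payload from the lowest bit of each sample."""
    out = bytearray()
    for b in range(num_bytes):
        value = 0
        for sample in stego[8 * b: 8 * b + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


# Hypothetical cover signal: 16 integer samples, enough for 2 payload bytes.
cover = list(range(-8, 8))
stego = lsb_embed(cover, b"hi")
recovered = lsb_extract(stego, 2)
```

Each sample changes by at most one quantization step, which is why the change is hard to hear, but it also leaves the statistical traces that LSB steganalysis tools look for.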
Acknowledgements
We would like to thank Ms. Aswathy P, Research Scholar, Department of Avionics, IIST Trivandrum, for her constant support and intellectual assistance in organizing this paper.
About this article
Cite this article
Paul, S., Mishra, D. Hiding images within audio using deep generative model. Multimed Tools Appl 82, 5049–5072 (2023). https://doi.org/10.1007/s11042-022-13034-4
DOI: https://doi.org/10.1007/s11042-022-13034-4