Abstract
Deep learning-based steganography has emerged recently, and end-to-end steganography is a promising direction within it. However, most existing approaches were developed for images and are not suitable for audio. In this paper, we design a CNN-based end-to-end framework that consists of an encoder and a decoder. The encoder embeds the secret message into the audio cover, and the corresponding decoder extracts it. Specifically, a derivative-based distortion function is adopted as the loss function of the encoder. In addition, instead of generating the stego audio directly, the encoder in our framework generates a modification vector for the audio sample values. In this way, the distortion incurred by message embedding can be further reduced. Experimental results show that, compared with the existing approach based on a generative adversarial network (GAN), our stego audio is more imperceptible even without an adversarial steganalytic network. Finally, because stego audio may be corrupted during transmission, we further improve the robustness of our approach by introducing noise simulation layers into the framework.
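The data flow described in the abstract can be illustrated with a small dependency-free Python sketch. It is not the framework from the paper: there is no CNN here, and every function name is hypothetical. The distortion weighting (modifications cost more where the cover's second-order difference is small) is only an assumed stand-in for a derivative-based distortion function, and additive Gaussian noise is only one assumed form a noise simulation layer might take.

```python
import random

INT16_MIN, INT16_MAX = -32768, 32767


def apply_modification(cover, modification):
    """Stego = cover + modification, rounded and clipped to the 16-bit sample range."""
    return [max(INT16_MIN, min(INT16_MAX, c + round(m)))
            for c, m in zip(cover, modification)]


def second_derivative(samples):
    """Magnitude of the discrete second-order difference (edges padded by repetition)."""
    padded = [samples[0]] + list(samples) + [samples[-1]]
    return [abs(padded[i - 1] - 2 * padded[i] + padded[i + 1])
            for i in range(1, len(padded) - 1)]


def derivative_weighted_distortion(cover, modification, eps=1.0):
    """Assumed loss: changes in smooth regions (small derivative) cost more."""
    weights = [1.0 / (d + eps) for d in second_derivative(cover)]
    return sum(w * abs(m) for w, m in zip(weights, modification)) / len(cover)


def gaussian_noise_layer(stego, sigma, rng):
    """Assumed stand-in for a noise simulation layer: additive Gaussian noise."""
    return [s + rng.gauss(0.0, sigma) for s in stego]


cover = [0, 0, 0, 0, 900, -700, 800, -600]   # smooth half, then busy half
mod_smooth = [1, -1, 1, -1, 0, 0, 0, 0]      # changes placed in the smooth half
mod_busy = [0, 0, 0, 0, 1, -1, 1, -1]        # same changes placed in the busy half

stego = apply_modification(cover, mod_busy)
noisy = gaussian_noise_layer(stego, sigma=0.5, rng=random.Random(0))

print(derivative_weighted_distortion(cover, mod_smooth))
print(derivative_weighted_distortion(cover, mod_busy))
```

Under this assumed weighting, the same modifications score a lower distortion in the busy half of the cover than in the smooth half. In a trained framework the modification vector would come from the encoder network, and the decoder would be trained to recover the message from the noisy stego signal.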
Acknowledgement
This work was supported by the National Natural Science Foundation of China (Grant No. U1736215, 61672302, 61901237), Zhejiang Natural Science Foundation (Grant No. LY20F020010, LY17F020010) and K.C. Wong Magna Fund in Ningbo University.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, J., Wang, R., Dong, L., Yan, D. (2020). Robust, Imperceptible and End-to-End Audio Steganography Based on CNN. In: Yu, S., Mueller, P., Qian, J. (eds) Security and Privacy in Digital Economy. SPDE 2020. Communications in Computer and Information Science, vol 1268. Springer, Singapore. https://doi.org/10.1007/978-981-15-9129-7_30
DOI: https://doi.org/10.1007/978-981-15-9129-7_30
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-9128-0
Online ISBN: 978-981-15-9129-7
eBook Packages: Computer Science (R0)