Robust, Imperceptible and End-to-End Audio Steganography Based on CNN

Wang, Jie; Wang, Rangding; Dong, Li; Yan, Diqun

doi:10.1007/978-981-15-9129-7_30

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1268)

Included in the following conference series:

International Conference on Security and Privacy in Digital Economy

Abstract

Deep learning-based steganography has emerged recently, and end-to-end steganography is a promising direction within it. However, most existing approaches are designed for images and are not suitable for audio. In this paper, we design a CNN-based end-to-end framework that consists of an encoder and a decoder. The encoder embeds the secret message into the audio cover, and the corresponding decoder extracts the message. Specifically, a derivative-based distortion function is adopted as the loss function of the encoder. In addition, instead of directly generating the stego audio, the encoder in our framework generates a modification vector for the audio sample values. In this way, the distortion incurred by message embedding can be further reduced. The experimental results show that, compared with the existing approach based on a generative adversarial network (GAN), our stego audio is more imperceptible even without an adversarial steganalytic network. In addition, considering that stego audio may be corrupted during transmission, we further improve the robustness of our approach by introducing noise simulation layers into the framework.
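
The residual idea in the abstract (the encoder outputs a modification vector that is added to the cover, rather than a full stego signal) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the 16-bit PCM clipping range, the use of a second-order difference as a stand-in for a derivative-based distortion measure, and additive Gaussian noise as a stand-in for a noise simulation layer. The hand-written modification vector replaces what a trained encoder would produce.

```python
import random

def apply_modification(cover, modification, lo=-32768, hi=32767):
    """Stego = cover + modification, clipped to an assumed 16-bit PCM range."""
    return [max(lo, min(hi, c + m)) for c, m in zip(cover, modification)]

def second_difference(x):
    """Discrete second-order derivative of a sample sequence."""
    return [x[i + 1] - 2 * x[i] + x[i - 1] for i in range(1, len(x) - 1)]

def derivative_distortion(cover, stego):
    """Mean absolute change in the second-order derivative.

    Illustrative proxy only; not the distortion function used in the paper.
    """
    dc = second_difference(cover)
    ds = second_difference(stego)
    return sum(abs(a - b) for a, b in zip(dc, ds)) / len(dc)

def add_gaussian_noise(stego, sigma, rng):
    """Hypothetical noise simulation layer: additive Gaussian noise."""
    return [s + rng.gauss(0.0, sigma) for s in stego]

# Toy cover samples and a hand-written modification vector (both made up).
rng = random.Random(0)
cover = [0, 120, 240, 310, 290, 180, 20, -130]
modification = [0, 1, -1, 0, 1, 0, -1, 0]

stego = apply_modification(cover, modification)
noisy = add_gaussian_noise(stego, sigma=0.5, rng=rng)

print("stego:", stego)
print("distortion:", derivative_distortion(cover, stego))
```

In a trained system the decoder would see `noisy` rather than `stego` during training, which is one plausible way a noise layer could encourage robustness; the sketch only shows where such a layer would sit in the pipeline.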

Notes

1. https://github.com/philipperemy/timit.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Grant No. U1736215, 61672302, 61901237), Zhejiang Natural Science Foundation (Grant No. LY20F020010, LY17F020010) and K.C. Wong Magna Fund in Ningbo University.

Author information

Authors and Affiliations

College of Information Science and Engineering, Ningbo University, Zhejiang, China
Jie Wang, Rangding Wang, Li Dong & Diqun Yan

Corresponding author

Correspondence to Rangding Wang.

Editor information

Editors and Affiliations

University of Technology Sydney, Sydney, NSW, Australia
Shui Yu
IBM Zurich Research Laboratory, Zurich, Switzerland
Peter Mueller
Ningbo University, Ningbo, China
Jiangbo Qian

About this paper

Cite this paper

Wang, J., Wang, R., Dong, L., Yan, D. (2020). Robust, Imperceptible and End-to-End Audio Steganography Based on CNN. In: Yu, S., Mueller, P., Qian, J. (eds) Security and Privacy in Digital Economy. SPDE 2020. Communications in Computer and Information Science, vol 1268. Springer, Singapore. https://doi.org/10.1007/978-981-15-9129-7_30

DOI: https://doi.org/10.1007/978-981-15-9129-7_30
Published: 22 October 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-9128-0
Online ISBN: 978-981-15-9129-7
eBook Packages: Computer Science, Computer Science (R0)
