Frame-level steganalysis of QIM steganography in compressed speech based on multi-dimensional perspective of codeword correlations

Wei, Miao; Li, Songbin; Liu, Peng; Huang, Yongfeng; Yan, Qiandong; Wang, Jingang; Zhang, Cheng

doi:10.1007/s12652-021-03608-9

Original Research
Published: 27 November 2021

Volume 14, pages 8421–8431, (2023)

Journal of Ambient Intelligence and Humanized Computing

Miao Wei¹,
Songbin Li ORCID: orcid.org/0000-0001-7243-5159¹,
Peng Liu¹,
Yongfeng Huang²,
Qiandong Yan¹,
Jingang Wang¹ &
…
Cheng Zhang³

Abstract

This paper proposes, for the first time, a frame-level steganalysis method for Quantization Index Modulation (QIM) steganography in compressed speech streams. The method builds a neural network classification framework on a multi-dimensional perspective of codeword correlations, inspired by cognitive biology. Four dimensions are employed: global-to-local, local-to-global, forward, and backward. First, a codeword embedding method maps each codeword into a compact representation. Next, a Bi-LSTM captures steganographic features in both forward and reverse time order. A dual-thread attention mechanism then extracts local and global features simultaneously. Finally, a channel attention mechanism increases the weights of the channels that contribute most to the current task, and convolutional and fully connected layers generate the frame-level steganographic label. Experimental results show that the proposed method is effective and practical in frame-level detection tasks.
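
The frame-level detection task can be illustrated with a toy sketch of QIM embedding. It is not the authors' code or dataset: the scalar codebook, the parity-based partition, and all function names are assumptions chosen to keep the example small, whereas real speech codecs quantize vectors and may partition their codebooks differently. The sketch only shows the general QIM idea (the hidden bit selects the sub-codebook used for quantization) and how per-frame labels arise when only some frames carry hidden bits.

```python
# Toy scalar "codebook" (hypothetical values); real codecs use vector codebooks.
CODEBOOK = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]


def nearest_index(value, allowed):
    """Return the index in `allowed` whose codeword is closest to `value`."""
    return min(allowed, key=lambda i: abs(CODEBOOK[i] - value))


def quantize(value, bit=None):
    """Quantize `value`; if `bit` is 0 or 1, use only that bit's sub-codebook.

    Assumption: the codebook is partitioned by index parity
    (even index -> bit 0, odd index -> bit 1).
    """
    indices = range(len(CODEBOOK))
    if bit is not None:
        indices = [i for i in indices if i % 2 == bit]
    return nearest_index(value, indices)


def extract_bit(index):
    """Recover the hidden bit from a codeword index (parity partition)."""
    return index % 2


def embed_stream(values, bits, stego_frames):
    """Quantize one value per frame, hiding a bit only in `stego_frames`.

    Returns (codeword indices, frame-level labels), where label 1 marks a
    frame that carries a hidden bit and label 0 marks a cover frame.
    """
    bit_iter = iter(bits)
    indices, labels = [], []
    for frame_no, value in enumerate(values):
        if frame_no in stego_frames:
            indices.append(quantize(value, next(bit_iter)))
            labels.append(1)
        else:
            indices.append(quantize(value))
            labels.append(0)
    return indices, labels


values = [0.2, 2.9, 4.4, 6.6, 1.1, 5.2]
bits = [1, 0, 1]
indices, labels = embed_stream(values, bits, stego_frames={1, 3, 5})
recovered = [extract_bit(indices[i]) for i in (1, 3, 5)]
```

A frame-level detector sees only the sequence `indices` and has to predict `labels` for each frame, which is why correlations between neighboring codewords matter: restricting a frame to one sub-codebook slightly shifts the codeword that would otherwise have been chosen.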

Notes

Our codes and dataset can be found at https://zenodo.org/record/5457267.

Acknowledgements

This work was supported in part by the Important Science and Technology Project of Hainan Province under Grant ZDKJ2020010, in part by the Hainan Provincial Natural Science Foundation of China under Grant 618QN309, and in part by the IACAS Free Exploration Project.

Author information

Authors and Affiliations

Institute of Acoustics, Chinese Academy of Sciences, Beijing, 100190, China
Miao Wei, Songbin Li, Peng Liu, Qiandong Yan & Jingang Wang
Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
Yongfeng Huang
The University of Melbourne, Melbourne, VIC 3010, Australia
Cheng Zhang

Corresponding author

Correspondence to Songbin Li.

About this article

Cite this article

Wei, M., Li, S., Liu, P. et al. Frame-level steganalysis of QIM steganography in compressed speech based on multi-dimensional perspective of codeword correlations. J Ambient Intell Human Comput 14, 8421–8431 (2023). https://doi.org/10.1007/s12652-021-03608-9

Received: 21 January 2021
Accepted: 18 November 2021
Published: 27 November 2021
Issue Date: July 2023
DOI: https://doi.org/10.1007/s12652-021-03608-9
