Abstract
Most existing speech steganographic methods break the continuity of adjacent pitch delays, which degrades their statistical undetectability. This paper presents a novel steganographic scheme for low bit-rate speech streams that resists pitch delay steganalysis. Three measures are adopted to enhance steganographic security. First, the short-term stability of pitch delay and the statistical distribution of adjacent subframes are considered in designing a distortion function. Second, syndrome-trellis codes (STCs) are used to minimize the overall embedding impact under the defined distortion function. Third, a suboptimal pitch delay is searched for to maintain speech quality. Experimental results demonstrate that our scheme achieves a higher level of security, especially at low embedding rates. When the relative embedding rate is 0.2 for a 10.2 kbit/s AMR stream, the test error rate for our method is 12.44% higher than that of the existing algorithm.
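To make the first measure more concrete, the following is a minimal sketch of a continuity-based embedding cost. It is not the distortion function defined in the paper: the names `continuity_cost` and `cheapest_change`, the restriction to ±1 changes, and the use of absolute differences to neighbouring subframes are all assumptions made for illustration. The idea it shows is only that a change which makes a pitch delay deviate further from its neighbours should cost more than one that keeps the sequence smooth.

```python
from typing import List, Tuple


def continuity_cost(delays: List[int], i: int, delta: int) -> float:
    """Toy cost of changing delays[i] by delta (hypothetical, not the paper's function).

    The cost is the increase in the summed absolute difference between
    subframe i and its immediate neighbours; changes that do not make the
    sequence rougher cost zero.
    """
    def local_roughness(value: int) -> int:
        total = 0
        if i > 0:
            total += abs(value - delays[i - 1])
        if i + 1 < len(delays):
            total += abs(delays[i + 1] - value)
        return total

    before = local_roughness(delays[i])
    after = local_roughness(delays[i] + delta)
    return float(max(after - before, 0))


def cheapest_change(delays: List[int], i: int) -> Tuple[int, float]:
    """Return the +1 or -1 change with the lowest toy cost at position i."""
    costs = {d: continuity_cost(delays, i, d) for d in (-1, 1)}
    best = min(costs, key=costs.get)
    return best, costs[best]


if __name__ == "__main__":
    pitch_delays = [50, 50, 52, 50]  # made-up integer pitch delays per subframe
    for idx in range(len(pitch_delays)):
        print(idx, cheapest_change(pitch_delays, idx))
```

In a scheme of the kind the abstract describes, per-element costs like these would be passed to a syndrome-trellis coder, which chooses the set of changes that carries the message at the lowest total cost. That coding step is not sketched here.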
This work was supported by National Key Technology R&D Program under 2016YFB0801003 and 2016QY15Z2500, NSFC under U1636102, U1736214, 61872356 and 61802393, and Project of Beijing Municipal Science & Technology Commission under Z181100002718001.
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Gong, C., Yi, X., Zhao, X. (2019). Pitch Delay Based Adaptive Steganography for AMR Speech Stream. In: Yoo, C., Shi, YQ., Kim, H., Piva, A., Kim, G. (eds) Digital Forensics and Watermarking. IWDW 2018. Lecture Notes in Computer Science, vol 11378. Springer, Cham. https://doi.org/10.1007/978-3-030-11389-6_21
DOI: https://doi.org/10.1007/978-3-030-11389-6_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11388-9
Online ISBN: 978-3-030-11389-6
eBook Packages: Computer Science, Computer Science (R0)