
Speech watermarking based tamper detection and recovery scheme with high tolerable tamper rate

Multimedia Tools and Applications

Abstract

Speech watermarking has been widely used for tamper detection and recovery. In this paper, we analyze the characteristics of continuous and discontinuous tamper and, on that basis, propose a speech watermarking based tamper detection and recovery method. The authentication watermarks and recovery watermarks are embedded into the original speech using align embedding and misalign embedding strategies, respectively. In particular, the misalign embedding strategy, which distributes the recovery watermarks repeatedly and widely, effectively prevents a speech segment and its recovery watermarks from being tampered with simultaneously, which significantly increases the tolerable tamper rate (TTR) of the proposed method. Several experiments concerning inaudibility, recovery rate, sound quality of recovered speech, and recovery percentage were carried out to evaluate the proposed method. The results suggest that the proposed method has good inaudibility. Moreover, under continuous tamper (for N ≥ 6), it tolerates a high tamper rate (around 50%) while providing a satisfactory recovery rate (100%) and speech quality (PESQ ≥ 3.0 ODG and LSD ≤ 1.0 dB). Similarly, it can recover most of the speech after discontinuous tamper, even under a high tamper rate. These results verify the effectiveness of the proposed method.
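The intuition behind misaligned embedding can be illustrated with a toy frame-level model. The sketch below is not the paper's embedding scheme: the cyclic offset mapping, the function name `recovery_rate`, and the frame counts are all assumptions made for illustration. It only shows why storing a frame's recovery data far from the frame itself helps under continuous tamper.

```python
def recovery_rate(tampered, n_frames, offsets):
    """Fraction of tampered frames that can still be recovered.

    Hypothetical model: the recovery watermark of frame i is copied into
    frames (i + d) % n_frames for every d in `offsets`. A tampered frame is
    recoverable if at least one frame hosting its copy is untampered.
    """
    lost = set(tampered)
    if not lost:
        return 1.0  # nothing was tampered, so nothing needs recovery
    recovered = sum(
        any((i + d) % n_frames not in lost for d in offsets)
        for i in lost
    )
    return recovered / len(lost)


if __name__ == "__main__":
    n = 100
    half = range(50)  # continuous tamper covering 50% of the frames

    # Copy stored in the neighbouring frame: tampered together with the frame.
    print(recovery_rate(half, n, [1]))

    # Copy stored half a signal away: every host frame survives.
    print(recovery_rate(half, n, [50]))

    # One more tampered frame and the single distant copy starts to fail.
    print(recovery_rate(range(51), n, [50]))

    # Several widely spread copies tolerate a longer continuous run.
    print(recovery_rate(range(60), n, [25, 50, 75]))
```

In this toy model, a single copy placed half the signal away survives any continuous run of up to half the frames, while a copy in the adjacent frame is lost with almost every tampered frame. The model ignores embedding capacity and audio quality, so its numbers should not be read as the paper's results.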





Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (62176182, 62201314, and 61771340), in part by the Natural Science Foundation of Shandong Province (ZR2020QF007), and in part by the Scientific Research Project of Tianjin Education Commission under Grant 19PTZWHZ00020.

Author information

Corresponding author

Correspondence to Weitao Yuan.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, S., Yuan, W., Zhang, Z. et al. Speech watermarking based tamper detection and recovery scheme with high tolerable tamper rate. Multimed Tools Appl 83, 6711–6729 (2024). https://doi.org/10.1007/s11042-023-15580-x


  • DOI: https://doi.org/10.1007/s11042-023-15580-x
