Muting Machine Speech Using Audio Watermarking

  • Conference paper
Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2018)

Abstract

Spoken dialog systems, such as smart speakers, have become popular in home environments. A problem arises when two or more smart speakers share the same environment: one dialog system may misdetect another dialog system's voice as a user's voice. This paper proposes a method for muting synthesized speech so that a speech recognizer does not recognize speech uttered by a machine. An audio watermark indicates that the speech was uttered by a machine, and the speech recognizer attenuates the observed speech if it contains the watermark. The watermark is embedded in the high-frequency band so that humans cannot perceive it and so that it can be extracted robustly. Experimental results show that the proposed method robustly determines the existence of the watermark when the SNR is no less than 0 dB.
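
The embed, detect, and attenuate flow described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm, which the abstract does not specify: a single high-frequency sine tone stands in for the watermark, and detection compares the power near that tone with the power in neighboring bands. The sampling rate, carrier frequency, embedding level, detection threshold, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

FS = 44100        # sampling rate in Hz (assumed)
MARK_HZ = 19000   # hypothetical watermark carrier, near the edge of human hearing


def embed_watermark(speech, fs=FS, mark_hz=MARK_HZ, level_db=-30.0):
    """Add a low-level high-frequency tone to synthesized speech (stand-in watermark)."""
    t = np.arange(len(speech)) / fs
    amplitude = 10 ** (level_db / 20)
    return speech + amplitude * np.sin(2 * np.pi * mark_hz * t)


def band_power(x, fs, lo, hi):
    """Mean power of the Hann-windowed spectrum over lo <= f < hi (in Hz)."""
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    return spectrum[(freqs >= lo) & (freqs < hi)].mean()


def has_watermark(x, fs=FS, mark_hz=MARK_HZ, half_width=100.0, threshold_db=10.0):
    """Declare the mark present when the carrier band stands out from its neighbors."""
    inside = band_power(x, fs, mark_hz - half_width, mark_hz + half_width)
    below = band_power(x, fs, mark_hz - 5 * half_width, mark_hz - half_width)
    above = band_power(x, fs, mark_hz + half_width, mark_hz + 5 * half_width)
    neighbors = 0.5 * (below + above)
    ratio_db = 10 * np.log10((inside + 1e-20) / (neighbors + 1e-20))
    return bool(ratio_db > threshold_db)


def mute_if_machine(x, fs=FS, attenuation_db=60.0):
    """Attenuate the observed signal before recognition if it carries the mark."""
    if has_watermark(x, fs):
        return x * 10 ** (-attenuation_db / 20)
    return x
```

In this sketch, a synthesizer would call `embed_watermark` on its output, and a recognizer front end would pass each observed segment through `mute_if_machine` before decoding. A practical watermark would also need to survive loudspeaker playback, room acoustics, and microphone capture, which this toy tone detector does not model.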

Acknowledgment

Part of this work was supported by JSPS Kakenhi JP17H00823.

Author information

Corresponding author

Correspondence to Akinori Ito.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ito, A. (2019). Muting Machine Speech Using Audio Watermarking. In: Pan, JS., Ito, A., Tsai, PW., Jain, L. (eds) Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing. IIH-MSP 2018. Smart Innovation, Systems and Technologies, vol 110. Springer, Cham. https://doi.org/10.1007/978-3-030-03748-2_9