Abstract:
Artificial intelligence technology has been developing rapidly, and speech synthesis models have matured to the point that they can generate highly realistic synthetic audio, which can be used to disseminate misinformation and poses a serious security risk. Digital watermarking can effectively protect digital content, and deep learning has recently achieved significant research success in this area. However, the robustness of existing methods against audio manipulation remains understudied. We therefore propose a robust, deep-learning-based audio watermarking method that resists manipulation attacks. Specifically, the encoder embeds the watermark information and the decoder extracts it. In addition, various audio attacks are simulated during iterative training, a sampling noise layer is used to increase robustness, and a discriminator is trained to distinguish encoded audio from original audio in order to improve the imperceptibility of the watermark. We comprehensively evaluate the performance of our model against various manipulation attacks. Experimental results demonstrate that the framework effectively embeds and extracts watermarks and exhibits strong robustness.
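To make the idea of a sampling noise layer concrete, the following is a minimal sketch, assuming that at each training iteration one simulated attack is drawn at random and applied to the watermarked audio before it reaches the decoder. The abstract does not list the attacks or the sampling rule, so the four attacks, their parameters, the uniform sampling, and all function names here are illustrative assumptions and not the authors' implementation.

```python
import random

# All attacks and parameters below are hypothetical stand-ins; the abstract
# only states that "various audio attacks are simulated" during training.

def gaussian_noise(audio, rng, sigma=0.01):
    """Add zero-mean Gaussian noise to every sample."""
    return [x + rng.gauss(0.0, sigma) for x in audio]

def amplitude_scale(audio, rng, factor=0.8):
    """Scale the waveform amplitude by a constant factor."""
    return [x * factor for x in audio]

def sample_suppression(audio, rng, drop_prob=0.1):
    """Zero out each sample independently with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else x for x in audio]

def moving_average(audio, rng, width=3):
    """Crude low-pass filter: average each sample with its neighbours."""
    half = width // 2
    out = []
    for i in range(len(audio)):
        window = audio[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

ATTACKS = [gaussian_noise, amplitude_scale, sample_suppression, moving_average]

def sampling_noise_layer(watermarked, rng):
    """Pick one simulated attack uniformly at random and apply it."""
    attack = rng.choice(ATTACKS)
    return attack.__name__, attack(watermarked, rng)

# Toy "watermarked" waveform; in training this would be the encoder output.
rng = random.Random(0)
audio = [0.1 * ((i % 10) - 5) for i in range(100)]
for step in range(3):
    name, attacked = sampling_noise_layer(audio, rng)
    print(step, name, len(attacked))
```

In a full training loop, the attacked signal would be passed to the decoder, so the extraction loss is computed on distorted audio and the encoder and decoder are pushed toward watermarks that survive these manipulations.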
Published in: IEEE Signal Processing Letters (Volume: 32)