ABSTRACT
Training Deep Neural Networks (DNNs) is time-consuming and requires large amounts of training data, which motivates studies on protecting the intellectual property (IP) of DNN models with various watermarking techniques. Unfortunately, adversaries have in recent years exploited the vulnerabilities of these watermarking techniques to remove the embedded watermarks. In this paper, we investigate and introduce a novel watermark removal attack, called AdvNP, against all four existing types of DNN watermarking schemes via input preprocessing, by injecting Adversarial Naturalness-aware Perturbations. In contrast to prior studies, our proposed method is the first that generalizes well to all four existing watermarking schemes without any model modification, thereby preserving the fidelity of the target model. We conduct experiments against four state-of-the-art (SOTA) watermarking schemes on two real-world tasks (image classification on ImageNet and face recognition on CelebA) across multiple DNN models. Overall, our proposed AdvNP significantly invalidates the watermarks of the four watermarking schemes on both datasets, achieving an average attack success rate of 60.9% and up to 97% in the worst case. Moreover, AdvNP survives image denoising techniques and outperforms the baseline in both fidelity preservation and watermark removal. Furthermore, we introduce two defense methods to enhance the robustness of DNN watermarking against AdvNP. Our experimental results pose real threats to existing watermarking schemes and call for more practical and robust watermarking techniques to protect the copyright of pre-trained DNN models. The source code and models are available at https://github.com/GitKJ123/AdvNP.
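The core idea above, perturbing inputs on their way into the watermarked model rather than modifying the model itself, can be illustrated with a minimal FGSM-style sketch (fast gradient sign method, Goodfellow et al., 2014). This is not the paper's AdvNP algorithm: the toy logistic "model", its weights, and the epsilon value are illustrative assumptions only.

```python
import math

def fgsm_perturb(x, w, b, y, eps):
    """Shift input x by eps in the sign of the loss gradient.

    The 'model' is a toy logistic regression p = sigmoid(w . x + b);
    only the input is changed, the weights w, b stay untouched,
    mimicking the input-preprocessing setting described above.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))            # model confidence for class 1
    grad = [(p - y) * wi for wi in w]         # d(BCE loss)/dx
    sgn = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sgn(gi) for xi, gi in zip(x, grad)]

def conf(w, b, v):
    return 1.0 / (1.0 + math.exp(-(sum(wi * vi for wi, vi in zip(w, v)) + b)))

# Toy demo: an input the model confidently assigns to class 1
w = [0.8, -0.5, 0.3, -0.9]
b = 0.0
x = list(w)                                   # strongly class-1 input
x_adv = fgsm_perturb(x, w, b, y=1.0, eps=0.5)
print(conf(w, b, x) > conf(w, b, x_adv))      # perturbation lowers confidence
```

In a watermark-removal setting, the same input-side step would be applied to every query so that trigger inputs no longer elicit the watermark behavior, while the deployed model's weights, and hence its fidelity, are left unchanged.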
Rethinking the Vulnerability of DNN Watermarking: Are Watermarks Robust against Naturalness-aware Perturbations?