
Hierarchical Learning and Dummy Triplet Loss for Efficient Deepfake Detection

Published: 09 December 2023

Abstract

The advancement of generative models has made it easier to create highly realistic Deepfake videos. This accessibility has led to a surge in research on Deepfake detection to mitigate potential misuse. Typically, Deepfake detection models utilize binary backbones, even though the training dataset contains additional exploitable information, such as the Deepfake generation method employed for each video. However, recent findings suggest that inferring a binary class from a multi-class backbone yields superior performance compared to directly employing a binary backbone. Building upon this research, our article introduces two novel methods to infer a binary class from a multi-class backbone. The first method, named root dummies, leverages the dummy triplet loss, which employs fixed vectors (i.e., dummies) instead of mined positives and negatives in the triplet loss. By training the multi-class backbone with these dummies, we can easily infer a binary class during testing by adjusting the number of dummies (from six during training to two during inference). Through this approach, we achieve an accuracy improvement of 0.23% compared to the existing inference method, without requiring additional training. The second proposed method is transfer learning. It involves training a classifier, such as a support vector machine, to predict binary classes based on the image embeddings generated by the multi-class backbone. Although this method necessitates additional training, it further enhances the model’s performance, resulting in an accuracy increase of 1.79%. In summary, our proposed methods improve the accuracy of Deepfake detection by simply modifying the number of classes during training, making them suitable for integration into a variety of existing Deepfake training pipelines. Additionally, to foster reproducible research, we have made the source code of our solution publicly available at https://github.com/beuve/DmyT.
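The core idea of the dummy triplet loss is that mined positives and negatives are replaced by fixed per-class target vectors, so a binary real/fake decision at test time reduces to comparing distances against the dummies. Below is a minimal sketch of that mechanism; the one-hot dummy vectors and the real-dummy-vs.-nearest-fake-dummy reduction are illustrative placeholders, not the exact construction used in the authors' DmyT code.

```python
import numpy as np

def dummy_triplet_loss(embeddings, labels, dummies, margin=1.0):
    """Triplet loss with fixed class 'dummies' as positives/negatives.

    For each embedding, the positive is the dummy of its own class and
    the negative is the nearest dummy of any other class, so no triplet
    mining is required.
    """
    total = 0.0
    for emb, lab in zip(embeddings, labels):
        d_pos = np.linalg.norm(emb - dummies[lab])
        d_neg = min(np.linalg.norm(emb - d)
                    for c, d in enumerate(dummies) if c != lab)
        total += max(0.0, d_pos - d_neg + margin)
    return total / len(embeddings)

def binary_from_dummies(embedding, dummies, real_class=0):
    """Infer real/fake by comparing the real dummy against the nearest
    fake dummy (an illustrative stand-in for the six-to-two reduction)."""
    d_real = np.linalg.norm(embedding - dummies[real_class])
    d_fake = min(np.linalg.norm(embedding - d)
                 for c, d in enumerate(dummies) if c != real_class)
    return "fake" if d_fake < d_real else "real"

# Six classes during training (real + five generation methods);
# one-hot vectors stand in for the paper's fixed dummy targets.
dummies = np.eye(6)
print(dummy_triplet_loss(dummies[[0, 3]], [0, 3], dummies))  # 0.0
print(binary_from_dummies(dummies[3], dummies))              # fake
```

The second proposed method, fitting a binary classifier (e.g., a support vector machine) on the multi-class backbone's embeddings, would simply replace `binary_from_dummies` with a learned decision function over the same embedding space.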


Cited By

  • (2025) Bi-LORA: A Vision-Language Approach for Synthetic Image Detection. Expert Systems 42, 2. DOI: 10.1111/exsy.13829. Online publication date: 8 January 2025.


Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 3
March 2024
665 pages
EISSN: 1551-6865
DOI: 10.1145/3613614
Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 December 2023
Online AM: 05 October 2023
Accepted: 18 September 2023
Revised: 18 July 2023
Received: 10 June 2022
Published in TOMM Volume 20, Issue 3


Author Tags

  1. Deepfake forensics
  2. neural networks
  3. metric learning

Qualifiers

  • Research-article
