DOI: 10.1145/3652583.3658082
research-article

CodeDetector: Revealing Forgery Traces with Codebook for Generalized Deepfake Detection

Published: 07 June 2024

Abstract

The malicious use of deepfake technologies poses a significant threat to social security, underscoring the urgent need to advance deepfake detection. Existing detection models tend to overfit to specific forgery traces in the training set, resulting in weak generalization to unseen data. Because the feature distribution of real images is limited compared with the diversity of forgery patterns, capturing the real feature space and treating features outside that distribution as potential forgery traces can mitigate overfitting and enhance the detector's generalization ability. Furthermore, since facial manipulation generates pixel-level artifacts and disrupts the global consistency of the data, potential forgery traces at both the pixel and feature levels can contribute to detection. In this paper, we propose a novel two-stage deepfake detection model named CodeDetector, which uses a codebook to capture the feature space of real faces and obtains potential forgery traces through facial reconstruction. In the codebook learning stage, a codebook is used to capture the real feature distribution. In the detector training stage, we obtain pixel-level and feature-level residuals as potential forgery traces through facial reconstruction of real and fake faces, guiding the model's attention to forgery clues. Specifically, we propose a Quantized Residual-Guided Attention Module and a Dual Residual Attention Module, which calculate residuals and use the attention mechanism to enhance global feature representation. Additionally, an Indices Prediction Module is introduced to ensure the accuracy of the residual guidance, which enhances the robustness of reconstruction during detection. Extensive experiments demonstrate that CodeDetector outperforms state-of-the-art methods on the cross-dataset deepfake detection benchmark.
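
The core idea of the abstract, that features far from a codebook learned on real faces leave a large residual, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`quantize`, `feature_residual`, `residual_norm`), the two-dimensional toy features, and the hand-written codebook values are all assumptions made for illustration. The sketch uses plain nearest-neighbour vector quantization and omits the encoder, decoder, attention modules, and Indices Prediction Module entirely.

```python
import math

def quantize(feature, codebook):
    """Return (index, code) of the codebook entry nearest to `feature`
    under Euclidean distance (plain nearest-neighbour vector quantization)."""
    best = min(range(len(codebook)), key=lambda i: math.dist(feature, codebook[i]))
    return best, codebook[best]

def feature_residual(feature, codebook):
    """Element-wise difference between a feature and its quantized version."""
    _, code = quantize(feature, codebook)
    return [f - c for f, c in zip(feature, code)]

def residual_norm(feature, codebook):
    """Euclidean length of the residual; larger means further from the codebook."""
    return math.hypot(*feature_residual(feature, codebook))

# Hypothetical toy codebook standing in for one learned from real faces.
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]

in_distribution = [1.0, 1.0]   # coincides with a code, so the residual is zero
off_distribution = [4.0, 4.0]  # far from every code, so the residual is large

print(residual_norm(in_distribution, codebook))
print(residual_norm(off_distribution, codebook))
```

In this toy setting the second feature maps to the code `[1.0, 1.0]` and leaves a residual of `[3.0, 3.0]`, the kind of out-of-distribution signal the abstract describes as a potential forgery trace.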

    Published In

    ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval
    May 2024
    1379 pages
    ISBN:9798400706196
    DOI:10.1145/3652583

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. codebook vector quantization
    2. deepfake detection

    Qualifiers

    • Research-article

    Funding Sources

    • 2023 Shenzhen sustainable supporting funds for colleges and universities
    • Shenzhen Science and Technology Program

    Conference

    ICMR '24
    Acceptance Rates

    Overall Acceptance Rate 254 of 830 submissions, 31%
