DOI: 10.1145/3581783.3613842

Spatio-Temporal Catcher: A Self-Supervised Transformer for Deepfake Video Detection

Published: 27 October 2023

ABSTRACT

As deepfake technology becomes increasingly sophisticated and accessible, it has become easier for individuals with malicious intent to create convincing fake content, raising considerable concern in the multimedia and computer vision community. Despite significant advances in deepfake video detection, most existing methods focus on model architecture and training procedures, with little attention to the data perspective. In this paper, we argue that data quality has become the main bottleneck of current research. Specifically, in the pre-training phase, the domain shift between the pre-training and target datasets may lead to poor generalization; in the training phase, the low fidelity of existing datasets leads detectors to rely on specific low-level visual artifacts or inconsistencies. To overcome these shortcomings, (1) in the pre-training phase, we pre-train our model on high-quality facial videos using data-efficient reconstruction-based self-supervised learning to mitigate domain shift; (2) in the training phase, we develop a novel spatio-temporal generator that synthesizes diverse high-quality "fake" videos in large quantities at low cost, enabling our model to learn more general spatio-temporal representations in a self-supervised manner; (3) additionally, to take full advantage of the synthetic "fake" videos, we adopt diversity losses at both the frame and video levels to explore the diversity of clues in "fake" videos. Our proposed framework is data-efficient and does not require any real-world deepfake videos. Extensive experiments demonstrate that our method significantly improves generalization; on the most challenging CDF and DFDC datasets in particular, it outperforms the baselines by 8.88 and 7.73 percentage points, respectively.
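
As a concrete illustration of the third ingredient, the following is a minimal sketch, assuming a PyTorch setup, of frame- and video-level diversity losses in the spirit described above: embeddings of different synthetic "fake" samples are pushed apart by penalizing their mean pairwise cosine similarity, and the two terms are added to an ordinary real/fake classification loss. Every name here (diversity_loss, total_loss, lambda_frame, lambda_video) and the exact formulation are illustrative assumptions rather than the paper's actual implementation.

    import torch
    import torch.nn.functional as F

    def diversity_loss(emb: torch.Tensor) -> torch.Tensor:
        # emb: (N, D) embeddings of N synthetic "fake" samples.
        # Penalize the mean pairwise cosine similarity so the encoder
        # spreads different fakes apart in representation space.
        z = F.normalize(emb, dim=-1)                       # unit-norm rows
        sim = z @ z.t()                                    # (N, N) cosine similarities
        off_diag = ~torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
        return sim[off_diag].mean()                        # lower value = more diverse

    def total_loss(frame_emb, video_emb, logits, labels,
                   lambda_frame=0.1, lambda_video=0.1):
        # Hypothetical objective: binary real/fake classification plus
        # diversity terms at the frame level (B*T frame embeddings) and
        # the video level (B clip embeddings). In a real pipeline the
        # diversity terms would be restricted to the synthetic fakes.
        cls = F.binary_cross_entropy_with_logits(logits, labels.float())
        l_frame = diversity_loss(frame_emb.flatten(0, 1))  # (B, T, D) -> (B*T, D)
        l_video = diversity_loss(video_emb)                # (B, D)
        return cls + lambda_frame * l_frame + lambda_video * l_video

    if __name__ == "__main__":
        B, T, D = 4, 8, 256                                # toy batch: 4 clips, 8 frames each
        frame_emb = torch.randn(B, T, D)
        video_emb = torch.randn(B, D)
        logits = torch.randn(B)                            # detector scores
        labels = torch.randint(0, 2, (B,))                 # 0 = real, 1 = fake
        print(total_loss(frame_emb, video_emb, logits, labels).item())

A cosine-similarity repulsion is only one plausible way to encourage diverse forgery clues; the paper's own frame- and video-level losses may be defined differently.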

    • Published in

      MM '23: Proceedings of the 31st ACM International Conference on Multimedia
      October 2023
      9913 pages
      ISBN: 9798400701085
      DOI: 10.1145/3581783

      Copyright © 2023 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 995 of 4,171 submissions, 24%
