Visual Explanations for Exposing Potential Inconsistency of Deepfakes

  • Conference paper
Digital Forensics and Watermarking (IWDW 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13825)

Abstract

In recent years, the rapid development of Deepfake technology has raised public concern. Existing Deepfake detection methods mainly focus on improving accuracy. However, when real-world victims need interpretable evidence to refute a forged video, accuracy alone is insufficient. To mitigate this issue, we delve into forgery traces and propose a novel framework, named Find-X, that presents additional visual information as an explanation of its results. Specifically, we design a new module named Separation Potential Inconsistency (SPI), which aims to visually explain the forgery traces of fake videos. Deepfake detection in Find-X consists of three stages: (1) a frequency-aware module and a spatial-aware module enhance the features; (2) a multi-scale feature extraction module extracts richer features; (3) a classification module and an SPI module produce the detection result and the visual explanations. Our method outperforms state-of-the-art competitors on three popular benchmark datasets: FaceForensics++, Celeb-DF, and DeepFakeDetection. In addition, extensive visualization experiments on FaceForensics++ demonstrate that SPI can effectively separate the potentially inconsistent features of videos generated by five different Deepfake methods.
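
The three-stage data flow named in the abstract can be illustrated with a toy sketch. Nothing below is taken from the paper: the high-pass filter, the gradient magnitude, the block pooling, and the weighted-sum heat map are simple stand-ins chosen for illustration, and every function name and the untrained unit weights are hypothetical. It only shows how two enhancement branches, a multi-scale step, and a pair of output heads (a score plus a per-pixel map) might fit together.

```python
import math

def high_pass(img):
    """Frequency-style stand-in: image minus its 3x3 mean blur (edges clamped)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = img[y][x] - total / 9.0
    return out

def gradient_magnitude(img):
    """Spatial-style stand-in: magnitude of clamped neighbour differences."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = math.hypot(gx, gy)
    return out

def pool_and_upsample(img, s):
    """Average over s-by-s blocks, then copy each mean back to full size."""
    h, w = len(img), len(img[0])  # assumes h and w are divisible by s
    out = [[0.0] * w for _ in range(h)]
    for by in range(0, h, s):
        for bx in range(0, w, s):
            cells = [(y, x) for y in range(by, by + s) for x in range(bx, bx + s)]
            mean = sum(img[y][x] for y, x in cells) / len(cells)
            for y, x in cells:
                out[y][x] = mean
    return out

def three_stage_sketch(frame, weights, scales=(1, 2, 4)):
    # Stage 1: two enhancement branches (hypothetical stand-ins).
    branches = [high_pass(frame), gradient_magnitude(frame)]
    # Stage 2: every branch at every scale becomes one feature channel.
    channels = [pool_and_upsample(b, s) for s in scales for b in branches]
    # Stage 3: a weighted channel sum gives a per-pixel map; its mean,
    # passed through a sigmoid, gives a single score.
    h, w = len(frame), len(frame[0])
    heat = [[sum(wt * ch[y][x] for wt, ch in zip(weights, channels))
             for x in range(w)] for y in range(h)]
    logit = sum(map(sum, heat)) / (h * w)
    return 1.0 / (1.0 + math.exp(-logit)), heat

# Toy input: a flat 8x8 frame with a 2x2 patch standing in for a pasted region.
frame = [[0.0] * 8 for _ in range(8)]
for y in (1, 2):
    for x in (1, 2):
        frame[y][x] = 1.0

prob, heat = three_stage_sketch(frame, weights=[1.0] * 6)
print(round(prob, 3), heat[1][1] > heat[6][6])
```

In this toy setup the map responds only in the neighbourhood of the pasted patch and stays at zero in the untouched corner of the frame, which is the kind of localized evidence a visual explanation is meant to provide.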

Acknowledgments

This work was supported by the National Key Technology Research and Development Program under Grant 2020AAA0140000.

Author information

Corresponding author

Correspondence to Xianfeng Zhao.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Pei, P., Zhao, X., Cao, Y., Hu, C. (2023). Visual Explanations for Exposing Potential Inconsistency of Deepfakes. In: Zhao, X., Tang, Z., Comesaña-Alfaro, P., Piva, A. (eds) Digital Forensics and Watermarking. IWDW 2022. Lecture Notes in Computer Science, vol 13825. Springer, Cham. https://doi.org/10.1007/978-3-031-25115-3_5

  • DOI: https://doi.org/10.1007/978-3-031-25115-3_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25114-6

  • Online ISBN: 978-3-031-25115-3

  • eBook Packages: Computer Science, Computer Science (R0)
