Exposing Deepfake Videos with Spatial, Frequency and Multi-scale Temporal Artifacts

Hu, Yongjian; Zhao, Hongjie; Yu, Zeqiong; Liu, Beibei; Yu, Xiangyu

doi:10.1007/978-3-030-95398-0_4

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 13180)

Included in the following conference series:

International Workshop on Digital Watermarking

Abstract

The deepfake technique replaces the face in a source video with a fake face generated by deep learning tools such as generative adversarial networks (GANs). Even the facial expression can be well synchronized, which makes the fake videos difficult to identify. Features drawn from multiple domains have proven effective in the literature. Temporal information is also known to be particularly critical in detecting deepfake videos, since the face-swapping of a video is implemented frame by frame. In this paper, we argue that the temporal differences between authentic and fake videos are complex and cannot be adequately depicted at a single time scale. To obtain a complete picture of the temporal deepfake traces, we design a detection model with a short-term feature extraction module and a long-term feature extraction module. The short-term module captures the gradient information of adjacent frames, which is combined with the frequency and spatial information to form a multi-domain feature set. The long-term module then reveals the artifacts over a longer period of context. The proposed algorithm is tested on several popular databases, namely FaceForensics++, DeepfakeDetection (DFD), TIMIT-DF and FFW. Experimental results validate the effectiveness of our algorithm through improved detection performance compared with related works.
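The two-scale idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' model: the abstract does not specify the network, so the function names, the use of a plain frame difference as the "gradient of adjacent frames", the log-magnitude FFT as the frequency cue, and the sliding-window average standing in for the long-term module are all assumptions made for illustration only.

```python
import numpy as np

def temporal_gradient(frames):
    """Difference between adjacent frames: (T, H, W) -> (T-1, H, W)."""
    return np.diff(frames.astype(np.float32), axis=0)

def log_spectrum(frame):
    """Centered log-magnitude 2D FFT spectrum of a single frame."""
    spec = np.fft.fftshift(np.fft.fft2(frame.astype(np.float32)))
    return np.log1p(np.abs(spec))

def short_term_features(frames):
    """Stack spatial, frequency and temporal-gradient channels.

    Hypothetical multi-domain feature set: one 3-channel map per
    adjacent frame pair, shape (T-1, 3, H, W).
    """
    frames = frames.astype(np.float32)
    grad = temporal_gradient(frames)                        # short-term temporal cue
    spatial = frames[1:]                                    # raw pixels of the later frame
    freq = np.stack([log_spectrum(f) for f in frames[1:]])  # frequency cue
    return np.stack([spatial, freq, grad], axis=1)

def long_term_summary(features, window):
    """Assumed stand-in for a long-term module.

    Averages per-step channel means over a sliding window of
    `window` steps, giving shape (T-1-window+1, 3).
    """
    per_step = features.mean(axis=(2, 3))                   # (T-1, 3)
    n = per_step.shape[0] - window + 1
    return np.stack([per_step[i:i + window].mean(axis=0) for i in range(n)])

# Usage on random data standing in for a cropped grayscale face sequence.
frames = np.random.default_rng(0).random((8, 16, 16))
feats = short_term_features(frames)
summary = long_term_summary(feats, window=4)
```

In a learned detector, the hand-written averages would be replaced by trainable layers (for example a CNN over the stacked channels and a recurrent unit over time); the sketch only shows how the short-term and long-term views differ in what they look at.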

Funding

This work is supported by the National Key Research and Development Project under Grant 2019QY2202, China-Singapore International Joint Research Institute under Grant 206-A018001 and Science and Technology Foundation of Guangzhou Huangpu Development District under Grant 2019GH16.

Author information

Authors and Affiliations

South China University of Technology, Guangzhou, China
Yongjian Hu, Hongjie Zhao, Zeqiong Yu, Beibei Liu & Xiangyu Yu

Corresponding author

Correspondence to Beibei Liu.

Editor information

Editors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Xianfeng Zhao
University of Florence, Florence, Italy
Alessandro Piva
Universidade de Vigo, Vigo, Spain
Pedro Comesaña-Alfaro

About this paper

Cite this paper

Hu, Y., Zhao, H., Yu, Z., Liu, B., Yu, X. (2022). Exposing Deepfake Videos with Spatial, Frequency and Multi-scale Temporal Artifacts. In: Zhao, X., Piva, A., Comesaña-Alfaro, P. (eds) Digital Forensics and Watermarking. IWDW 2021. Lecture Notes in Computer Science, vol 13180. Springer, Cham. https://doi.org/10.1007/978-3-030-95398-0_4

DOI: https://doi.org/10.1007/978-3-030-95398-0_4
Published: 21 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95397-3
Online ISBN: 978-3-030-95398-0
eBook Packages: Computer Science, Computer Science (R0)
