
Detecting facial manipulated images via one-class domain generalization

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

The proliferation of synthesized images and videos produced by facial manipulation techniques has made facial manipulation detection a significant research topic. Concern about such synthesized facial content is growing because of its deceptive nature and widespread distribution. Many detection methods have been proposed, but most focus on specific datasets and struggle to detect images or videos manipulated by unknown face synthesis algorithms. In this paper, we propose a method that improves the generalization ability of facial manipulation detection models through one-class domain generalization. We cast detection as a domain generalization problem and divide the dataset into several domains according to the manipulation algorithm used. We also process the images from the perspective of the frequency domain, applying a two-dimensional wavelet transform as a preprocessing step so that the method remains effective on compressed images. Experiments on the FaceForensics++ dataset show that our method exceeds the baselines and recent works, and feature visualization analyses intuitively show that it learns robust feature representations that generalize to unseen domains.
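The abstract describes a two-dimensional wavelet transform as a preprocessing step but does not specify the wavelet family or implementation. The following is a minimal, self-contained sketch, assuming a single-level 2-D Haar transform as an illustrative choice, showing how an image splits into one low-frequency and three high-frequency sub-bands:

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar wavelet transform.

    Returns the approximation (LL) sub-band and the three detail
    sub-bands (LH, HL, HH), each half the input size. The detail
    bands carry the high-frequency content where manipulation
    artifacts tend to concentrate.
    """
    x = img.astype(np.float64)
    # Transform along rows: pairwise averages (low) and differences (high).
    lo = (x[:, 0::2] + x[:, 1::2]) / 2.0
    hi = (x[:, 0::2] - x[:, 1::2]) / 2.0
    # Transform along columns of each intermediate band.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, (lh, hl, hh)

# A constant image has no high-frequency energy: all detail bands are zero.
flat = np.full((4, 4), 7.0)
ll, (lh, hl, hh) = haar_dwt2(flat)
```

A detector built along these lines would typically feed the detail sub-bands (or all four bands) to the network, since high-frequency artifacts of face synthesis often survive compression better in the wavelet representation than in raw pixels.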


Data availability

The data are available from the corresponding author on reasonable request.


Acknowledgements

This work was supported by the Research Center of Security Video and Image Processing Engineering Technology of Guizhou (China) under Grant SRC-Open Project ([2020]001).

Funding

This article was funded by the Research Center of Security Video and Image Processing Engineering Technology of Guizhou (China) under Grant SRC-Open Project [2020]001: Pengxiang Xu, Zhiyuan Ma, Xue Mei.

Author information


Corresponding author

Correspondence to Xue Mei.

Additional information

Communicated by R. Huang.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article


Cite this article

Xu, P., Ma, Z., Mei, X. et al. Detecting facial manipulated images via one-class domain generalization. Multimedia Systems 30, 33 (2024). https://doi.org/10.1007/s00530-023-01214-7
