Abstract
Advances in artificial intelligence have spurred rapid growth in image manipulation and processing tools, posing new challenges for digital forensics. Hackers and cybercriminals use these techniques to create counterfeit images and videos by introducing perturbations into facial traits. We propose a novel defensive framework that employs temporal and spatially aware features to identify deepfakes efficiently. The framework uses the facial landmarks in a video to train a self-attenuated VGG16 neural model that captures spatial attributes; optical flow feature vectors are then generated to extract temporal characteristics from the spatial vector. Because deepfake detection systems must also generalize across datasets, we built a custom dataset comprising samples from FaceForensics, Celeb-DF, and YouTube videos. Experimental analysis shows that the system achieves a detection accuracy of 98.4%. We further evaluate the robustness of the proposed framework under various adversarial settings using the Adversarial Robustness Toolbox, Foolbox, and CleverHans. Under these diverse adversarial conditions, the method classifies real and fake videos with an accuracy of 74.27%. An extensive empirical investigation of the cross-dataset generalization capacity of the proposed framework is also reported.
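The abstract leans on dense optical flow as the temporal cue. As a rough illustration only, and not the authors' implementation, the classic Horn–Schunck scheme (Horn and Schunck, 1981, cited in the reference list) can be sketched in NumPy; the frames, `alpha`, and iteration count below are all illustrative:

```python
import numpy as np

def horn_schunck(im1, im2, alpha=1.0, n_iter=100):
    """Estimate dense optical flow between two grayscale frames
    with the classic Horn-Schunck iteration (illustrative parameters)."""
    im1 = im1.astype(np.float64)
    im2 = im2.astype(np.float64)
    # Spatial derivatives averaged over both frames, plus the temporal derivative.
    Ix = (np.gradient(im1, axis=1) + np.gradient(im2, axis=1)) / 2.0
    Iy = (np.gradient(im1, axis=0) + np.gradient(im2, axis=0)) / 2.0
    It = im2 - im1
    u = np.zeros_like(im1)  # horizontal flow
    v = np.zeros_like(im1)  # vertical flow
    for _ in range(n_iter):
        # 4-neighbour local averages of the current flow estimate.
        u_avg = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                 np.roll(u, 1, 1) + np.roll(u, -1, 1)) / 4.0
        v_avg = (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
                 np.roll(v, 1, 1) + np.roll(v, -1, 1)) / 4.0
        # Jacobi update enforcing brightness constancy with smoothness weight alpha.
        common = (Ix * u_avg + Iy * v_avg + It) / (alpha**2 + Ix**2 + Iy**2)
        u = u_avg - Ix * common
        v = v_avg - Iy * common
    return u, v

# Synthetic frames: a bright block shifted right by 1 px between frames.
f1 = np.zeros((32, 32)); f1[10:20, 10:20] = 1.0
f2 = np.zeros((32, 32)); f2[10:20, 11:21] = 1.0
u, v = horn_schunck(f1, f2)
print(u.shape)  # per-pixel horizontal displacement field
```

In a detection pipeline of the kind the abstract describes, such per-frame (u, v) fields would be flattened or summarized into feature vectors before classification.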











References
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. Adv. Neural Inf. Process. Syst. 27 (2014)
FaceApp (2015). Available from: https://www.faceapp.com/
Perov, I., Gao, D., Chervoniy, N., Liu, K., Marangonda, S., Umé, C., Dpfks, M., Facenheim, C.S., RP, L., Jiang, J., et al.: “Deepfacelab: A simple, flexible and extensible face swapping framework,” arXiv preprint arXiv:2005.05535, (2020)
WatchMojo: Another top 10 deepfake videos.
Thies, J., Zollhofer, M., Stamminger, M., Theobalt, C., Nießner, M.: Face2face: Real-time face capture and reenactment of rgb videos, In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2387–2395, (2016)
Yu, P., Xia, Z., Fei, J., Lu, Y.: A survey on deepfake video detection. IET Biom. 10(6), 607–624 (2021)
Li, Y., Yang, X., Sun, P., Qi, H., Lyu, S.: Celeb-df: A large-scale challenging dataset for deepfake forensics, In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3207–3216, (2020)
Rossler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., Nießner, M.: Faceforensics++: Learning to detect manipulated facial images, In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1–11, (2019)
Nicolae, M.-I., Sinn, M., Tran, M.N., Rawat, A., Wistuba, M., Zantedeschi, V., Molloy, I., Edwards, B.: Adversarial Robustness Toolbox v0.10.0, CoRR, arXiv:1807.01069, (2018)
Rauber, J., Brendel, W., Bethge, M.: Foolbox: A Python toolbox to benchmark the robustness of machine learning models, arXiv preprint arXiv:1707.04131, (2018)
Papernot, N., Faghri, F., Carlini, N., Goodfellow, I., Feinman, R., Kurakin, A., Xie, C., Sharma, Y., Brown, T., Roy, A., et al.: Technical report on the CleverHans v2.1.0 adversarial examples library, arXiv preprint arXiv:1610.00768, (2016)
Wang, W., Jiang, X., Wang, S., Wan, M., Sun, T.: Identifying video forgery process using optical flow, In: International Workshop on Digital Watermarking, pp. 244–257, Springer, (2013)
Li, L., Bao, J., Zhang, T., Yang, H., Chen, D., Wen, F., Guo, B.: Face x-ray for more general face forgery detection, In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5001–5010, (2020)
Bonettini, N., Cannas, E.D., Mandelli, S., Bondi, L., Bestagini, P., Tubaro, S.: Video face manipulation detection through ensemble of cnns, In: 2020 25th international conference on pattern recognition (ICPR), pp. 5012–5019, IEEE, (2021)
Amerini, I., Galteri, L., Caldelli, R., Del Bimbo, A.: Deepfake video detection through optical flow based cnn, In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, (2019)
Wang, R., Juefei-Xu, F., Ma, L., Xie, X., Huang, Y., Wang, J., Liu, Y.: Fakespotter: A simple yet robust baseline for spotting ai-synthesized fake faces, arXiv preprint arXiv:1909.06122, (2019)
Dang, H., Liu, F., Stehouwer, J., Liu, X., Jain, A.K.: On the detection of digital face manipulation, In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern recognition, pp. 5781–5790, (2020)
Chen, L., Zhang, Y., Song, Y., Wang, J., Liu, L.: Ost: Improving generalization of deepfake detection via one-shot test-time training, In: Advances in Neural Information Processing Systems, (2022)
Nadimpalli, A.V., Rattani, A.: On improving cross-dataset generalization of deepfake detectors, In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 91–99, (2022)
Wang, T., Cheng, H., Chow, K.P., Nie, L.: Deep convolutional pooling transformer for deepfake detection, arXiv preprint arXiv:2209.05299, (2022)
Tolosana, R., Vera-Rodriguez, R., Fierrez, J., Morales, A., Ortega-Garcia, J.: Deepfakes and beyond: a survey of face manipulation and fake detection. Inf. Fusion 64, 131–148 (2020)
Nirkin, Y., Keller, Y., Hassner, T.: Fsgan: Subject agnostic face swapping and reenactment, In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7184–7193, (2019)
Zhu, J.-Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks, In: Proceedings of the IEEE international conference on computer vision, pp. 2223–2232, (2017)
Tang, C., Chen, S., Fan, L., Xu, L., Liu, Y., Tang, Z., Dou, L.: A large-scale empirical study on industrial fake apps, In 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pp. 183–192, IEEE, (2019)
Mondaini, N., Caldelli, R., Piva, A., Barni, M., Cappellini, V.: Detection of malevolent changes in digital video for forensic applications, In: Security, Steganography, and Watermarking of Multimedia Contents IX, vol. 6505, p. 65050T, International Society for Optics and Photonics, (2007)
Fridrich, J., Kodovsky, J.: Rich models for steganalysis of digital images. IEEE Trans. Inf. Forensics Secur. 7(3), 868–882 (2012)
Bestagini, P., Milani, S., Tagliasacchi, M., Tubaro, S.: Local tampering detection in video sequences, In: 2013 IEEE 15th International Workshop on Multimedia Signal Processing (MMSP), pp. 488–493, IEEE, (2013)
Pan, X., Zhang, X., Lyu, S.: Exposing image splicing with inconsistent local noise variances, In: 2012 IEEE International Conference on Computational Photography (ICCP), pp. 1–10, IEEE, (2012)
Matern, F., Riess, C., Stamminger, M.: Exploiting visual artifacts to expose deepfakes and face manipulations, In: 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), pp. 83–92, IEEE, (2019)
Luo, W., Huang, J., Qiu, G.: Robust detection of region-duplication forgery in digital image, In: 18th International Conference on Pattern Recognition (ICPR'06), vol. 4, pp. 746–749, IEEE, (2006)
Goljan, M., Fridrich, J.: CFA-aware features for steganalysis of color images, In: Media Watermarking, Security, and Forensics 2015, vol. 9409, p. 94090V, International Society for Optics and Photonics, (2015)
Huang, J., Zou, W., Zhu, J., Zhu, Z.: Optical flow based real-time moving object detection in unconstrained scenes, arXiv preprint arXiv:1807.04890, (2018)
Cohn, J.F., Zlochower, A.J., Lien, J.J., Kanade, T.: Feature-point tracking by optical flow discriminates subtle differences in facial expression, In: Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition, pp. 396–401, IEEE, (1998)
Aslani, S., Mahdavi-Nasab, H.: Optical flow based moving object detection and tracking for traffic surveillance. Int. J. Electr. Comput. Energ. Electron. Commun. Eng. 7(9), 1252–1256 (2013)
Souhila, K., Karim, A.: Optical flow based robot obstacle avoidance. Int. J. Adv. Robot. Syst. 4(1), 2 (2007)
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., Van Der Smagt, P., Cremers, D., Brox, T.: Flownet: Learning optical flow with convolutional networks, In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2758–2766, (2015)
Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features, In: Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), vol. 1, pp. I–I, IEEE, (2001)
Dlib Python API tutorials (2015). Available from: http://dlib.net/python/index.html
Horn, B.K., Schunck, B.G.: Determining optical flow. Artif. Intell. 17(1–3), 185–203 (1981)
Farnebäck, G.: Two-frame motion estimation based on polynomial expansion, In: Scandinavian Conference on Image Analysis, pp. 363–370, Springer, (2003)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples, arXiv preprint arXiv:1412.6572, (2014)
Kurakin, A., Goodfellow, I., Bengio, S., et al.: Adversarial examples in the physical world, (2016)
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks, arXiv preprint arXiv:1706.06083, (2017)
Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z.B., Swami, A.: The limitations of deep learning in adversarial settings, In: 2016 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 372–387, IEEE, (2016)
Carlini, N.: A critique of the deepsec platform for security analysis of deep learning models, arXiv preprint arXiv:1905.07112, (2019)
Rössler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., Nießner, M.: Faceforensics: A large-scale video dataset for forgery detection in human faces, arXiv preprint arXiv:1803.09179, (2018)
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. The authors did not receive support from any organization for the submitted work. This article does not contain any studies with human participants or animals performed by any of the authors. The data that support the findings of this study are openly available in public repositories: FaceForensics [8] at https://github.com/ondyari/FaceForensics and Celeb-DF [7] at https://www.cs.albany.edu/~lsw/celeb-deepfakeforensics.html.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Asha, S., Vinod, P. & Menon, V.G. A defensive framework for deepfake detection under adversarial settings using temporal and spatial features. Int. J. Inf. Secur. 22, 1371–1382 (2023). https://doi.org/10.1007/s10207-023-00695-x
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10207-023-00695-x