
Identify videos with facial manipulations based on convolution neural network and dynamic texture

Published in: Multimedia Tools and Applications

Abstract

Recent facial manipulation techniques based on deep learning can create highly realistic faces by changing expression, attributes, or identity, or by synthesizing an entire face; such content has recently been termed Deepfake. The rapid spread of these applications has raised serious security concerns, and corresponding forensic techniques have been proposed to tackle the issue. However, existing techniques either rely on complex deep networks with binary classification that cannot distinguish between facial manipulation types, or on fragile hand-crafted features with unsatisfactory results. To overcome these issues, we propose a learning-based detection method built on an uncomplicated CNN, called FMD-Net, that takes dynamic textures as input. It is also able to distinguish between facial manipulation types such as Deepfake, Face2Face, FaceSwap, and NeuralTextures. By using the dynamic textures of each video shot, motion and appearance features are combined, which helps the network learn manipulation artifacts and provides robust performance at various compression levels. We conduct extensive experiments on several benchmark datasets (FaceForensics++, DFDC, and Celeb-DF) to demonstrate empirically the superiority and effectiveness of the proposed method, in both binary and multi-class classification, against state-of-the-art methods.
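The abstract's core idea is collapsing each video shot into a dynamic-texture representation that carries both appearance and motion cues before feeding it to the CNN. The paper's exact dynamic-texture construction and the FMD-Net architecture are not reproduced on this page; as a rough, hypothetical stand-in for that idea, one could pair the temporal mean of a shot (appearance) with its temporal standard deviation (motion) as a two-channel network input:

```python
import numpy as np

def dynamic_texture(frames):
    """Hypothetical sketch: collapse a shot (T grayscale frames) into one
    two-channel map mixing appearance (temporal mean) and motion
    (temporal std). Not the paper's exact construction."""
    frames = np.asarray(frames, dtype=np.float32)  # shape (T, H, W)
    appearance = frames.mean(axis=0)               # static content of the shot
    motion = frames.std(axis=0)                    # per-pixel temporal variation
    return np.stack([appearance, motion], axis=-1) # shape (H, W, 2)
```

A static shot yields a zero motion channel, while manipulation artifacts that flicker across frames show up as localized temporal variance, which is the kind of cue a detector can learn from.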




Notes

  1. A short sequence of frames.

  2. Face detection is based on classification and regression tree (CART) analysis [31].

  3. https://github.com/ondyari/FaceForensics

  4. The H.264 codec was used to compress all videos, with quantization parameter 23 for light compression (C23) and 40 for strong compression (C40).

  5. https://www.kaggle.com/c/deepfake-detection-challenge

  6. http://www.cs.albany.edu/~lsw/celeb-deepfakeforensics.html
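The two compression levels in note 4 correspond to H.264 constant-rate-factor settings, which can be reproduced with ffmpeg's libx264 encoder. This is an illustrative sketch only (filenames are placeholders, and the dataset authors' exact encoder flags may differ); the commands are echoed rather than executed:

```shell
# Hypothetical re-creation of the two compression levels (placeholder filenames).
CRF_LIGHT=23    # C23, light compression
CRF_STRONG=40   # C40, strong compression
echo "ffmpeg -i raw.mp4 -c:v libx264 -crf ${CRF_LIGHT} c23.mp4"
echo "ffmpeg -i raw.mp4 -c:v libx264 -crf ${CRF_STRONG} c40.mp4"
```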

References

  1. Afchar D, Nozick V, Yamagishi J, Echizen I (2018) Mesonet: a compact facial video forgery detection network. In: IEEE international workshop on information forensics and security (WIFS), vol 2018. IEEE, pp 1–7

  2. Amrani M, Hammad M, Jiang F, Wang K, Amrani A (2018) Very deep feature extraction and fusion for arrhythmias detection. Neural Comput & Applic 30(7):2047–2057

  3. Arora M, Kumar M (2021) Autofer: Pca and pso based automatic facial emotion recognition. Multimed Tools Appl 80(2):3039–3049

  4. Kim P (2017) MATLAB Deep Learning: With Machine Learning, Neural Networks and Artificial Intelligence, 1st edn. Apress

  5. Bakas J, Naskar R, Dixit R (2019) Detection and localization of inter-frame video forgeries based on inconsistency in correlation distribution between haralick coded frames. Multimed Tools Appl 78(4):4905–4935. https://doi.org/10.1007/s11042-018-6570-8

  6. Bansal M, Kumar M, Kumar M, Kumar K (2021) An efficient technique for object recognition using shi-tomasi corner detection algorithm. Soft Comput 25(6):4423–4432

  7. Bayar B, Stamm MC (2016) A deep learning approach to universal image manipulation detection using a new convolutional layer. In: Proceedings of the 4th ACM workshop on information hiding and multimedia security. ACM, pp 5–10

  8. Bishop CM (2006) Pattern recognition and machine learning. springer, Berlin

  9. Boylan JF (2018) The new york times will deepfake technology destroy democracy?. https://www.nytimes.com/2018/10/17/opinion/deep-fake-technology-democracy.html

  10. Chollet F (2017) Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)

  11. Cozzolino D, Poggi G, Verdoliva L (2017) Recasting residual-based local descriptors as convolutional neural networks: an application to image forgery detection. In: Proceedings of the 5th ACM Workshop on information hiding and multimedia security. ACM, pp 159–164

  12. Dargan S, Kumar M, Ayyagari MR, Kumar G (2019) A survey of deep learning and its applications: a new paradigm to machine learning. Arch Comput Methods Eng, 1–22

  13. Dolhansky B, Howes R, Pflaum B, Baram N, Ferrer CC (2019) The deepfake detection challenge (dfdc) preview dataset. arXiv:191008854

  14. Doretto G, Chiuso A, Wu YN, Soatto S (2003) Dynamic textures. Int J Comput Vis 51(2):91–109

  15. Elaskily MA, Elnemr HA, Dessouky MM, Faragallah OS (2019) Two stages object recognition based copy-move forgery detection algorithm. Multimed Tools Appl 78(11):15353–15373. https://doi.org/10.1007/s11042-018-6891-7

  16. Fadl S, Han Q, Qiong L (2020) Exposing video inter-frame forgery via histogram of oriented gradients and motion energy image. Multidim Syst Sign Process. 1–20

  17. Fadl SM, Semary NA (2017) Robust copy–move forgery revealing in digital images using polar coordinate system. Neurocomputing 265:57–65. https://doi.org/10.1016/j.neucom.2016.11.091

  18. Fridrich J, Kodovsky J (2012) Rich models for steganalysis of digital images. IEEE Trans Inf Forensics Secur 7(3):868–882. https://doi.org/10.1109/TIFS.2012.2190402

  19. Fung S, Lu X, Zhang C, Li CT (2021) Deepfakeucl: Deepfake detection via unsupervised contrastive learning. arXiv:210411507

  20. Gupta S, Mohan N, Kumar M (2020) A study on source device attribution using still images. Arch Comput Methods Eng 1–15

  21. Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv:150203167

  22. Sitara K, Mehtre BM (2018) Detection of inter-frame forgeries in digital videos. Forensic Sci Int 289:186–206. https://doi.org/10.1016/j.forsciint.2018.04.056

  23. Khalid H, Woo SS (2020) Oc-fakedect: Classifying deepfakes using one-class variational autoencoder. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 656–657

  24. Korshunov P, Marcel S (2018) Deepfakes: a new threat to face recognition? assessment and detection. arXiv:181208685

  25. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105

  26. Kumar A, Kumar M, Kaur A (2021a) Face detection in still images under occlusion and non-uniform illumination. Multimed Tools Appl 80(10):14565–14590

  27. Kumar M, Kumar M et al (2021b) Xgboost: 2d-object recognition using shape descriptors and extreme gradient boosting classifier. In: Computational methods and data engineering. Springer, pp 207–222

  28. Kumar P, Vatsa M, Singh R (2020) Detecting face2face facial reenactment in videos. In: The IEEE winter conference on applications of computer vision (WACV)

  29. Laws KI (1980) Textured image segmentation. Tech. rep. University of Southern California Los Angeles Image Processing INST

  30. Li Y, Yang X, Sun P, Qi H, Lyu S (2020) Celeb-df: A large-scale challenging dataset for deepfake forensics. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR)

  31. Lienhart R, Kuranov A, Pisarevsky V (2003) Empirical analysis of detection cascades of boosted classifiers for rapid object detection. In: Michaelis B, Krell G (eds) Pattern recognition. Springer, Berlin, pp 297–304

  32. Matern F, Riess C, Stamminger M (2019) Exploiting visual artifacts to expose deepfakes and face manipulations. In: 2019 IEEE Winter applications of computer vision workshops (WACVW), pp 83–92. https://doi.org/10.1109/WACVW.2019.00020

  33. Megahed A, Han Q (2020) Face2face manipulation detection based on histogram of oriented gradients. In: 2020 IEEE 19th International conference on trust, security and privacy in computing and communications (TrustCom), pp 1260–1267. https://doi.org/10.1109/TrustCom50675.2020.00169

  34. Nair V, Hinton GE (2010) Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10), pp 807–814

  35. Pun CM, Liu B, Yuan XC (2016) Multi-scale noise estimation for image splicing forgery detection. J Vis Commun Image Represent 38:195–206. https://doi.org/10.1016/j.jvcir.2016.03.005

  36. Rahmouni N, Nozick V, Yamagishi J, Echizen I (2017) Distinguishing computer graphics from natural images using convolution neural networks. In: 2017 IEEE Workshop on information forensics and security (WIFS). IEEE, pp 1–6

  37. Rössler A, Cozzolino D, Verdoliva L, Riess C, Thies J, Nießner M (2018) Faceforensics: A large-scale video dataset for forgery detection in human faces. arXiv:180309179

  38. Rössler A, Cozzolino D, Verdoliva L, Riess C, Thies J, Nießner M (2019) Faceforensics++: Learning to detect manipulated facial images. In: Proceedings of the IEEE international conference on computer vision, pp 1–11

  39. Sabir E, Cheng J, Jaiswal A, AbdAlmageed W, Masi I, Natarajan P (2019) Recurrent convolutional strategies for face manipulation detection in videos. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops (CVPRW)

  40. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

  41. Szummer M, Picard RW (1996) Temporal texture modeling. In: Proceedings of 3rd IEEE international conference on image processing, vol 3. IEEE, pp 823–826

  42. Tharwat A (2018) Classification assessment methods. Applied Computing and Informatics. https://doi.org/10.1016/j.aci.2018.08.003

  43. Wang G, Zhou J, Wu Y (2020) Exposing deep-faked videos by anomalous co-motion pattern detection. arXiv:200804848

  44. Wu X, Xie Z, Gao Y, Xiao Y (2020) Sstnet: Detecting Manipulated faces through spatial, steganalysis and temporal features. In: ICASSP 2020-2020 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 2952–2956

  45. Zhang Q, Lu W, Weng J (2016) Joint image splicing detection in dct and contourlet transform domain. J Vis Commun Image Represent 40:449–458. https://doi.org/10.1016/j.jvcir.2016.07.013

  46. Zhao G, Pietikäinen M (2007) Dynamic texture recognition using volume local binary patterns. In: Dynamical vision. Springer, pp 165–177

  47. Zhao G, Pietikäinen M (2007) Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Trans Pattern Anal Mach Intell 29(6):915–928

  48. Zhou P, Han X, Morariu VI (2017) Two-stream neural networks for tampered face detection. In: 2017 IEEE conference on computer vision and pattern recognition workshops (CVPRW). IEEE, pp 1831–1839


Acknowledgements

This work was supported by the National Natural Science Foundation of China [grant numbers 61771168, 61471141, 61361166006, 61571018, and 61531003]; Key Technology Program of Shenzhen, China, [grant number JSGG20160427185010977]; Basic Research Project of Shenzhen, China [grant number JCYJ20150513151706561].

Author information

Corresponding author: Qi Han

Ethics declarations

Declaration of competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Megahed, A., Han, Q. Identify videos with facial manipulations based on convolution neural network and dynamic texture. Multimed Tools Appl 81, 43441–43466 (2022). https://doi.org/10.1007/s11042-022-13102-9
