Emergence of deepfakes and video tampering detection approaches: A survey

Kingra, Staffy; Aggarwal, Naveen; Kaur, Nirmal

doi:10.1007/s11042-022-13100-x

Published: 05 August 2022


Multimedia Tools and Applications


Abstract

Digital content, particularly digital video recorded from a specific angle, is generally taken to provide a truthful picture of reality, but the widespread proliferation of easy-to-use content editing software casts doubt on its authenticity. Recently, an Artificial Intelligence (AI) based content-altering mechanism known as deepfake has become popular on social media platforms; with it, anyone can make a person appear to act or speak in a video in which that person was never actually present. This paper describes the different types of deepfakes according to the type of manipulation performed. Moreover, because digital content is relied upon as trustworthy evidence and the spread of misinformation must be avoided, the integrity and authenticity of digital content have become matters of utmost concern. This paper presents a survey of state-of-the-art video integrity verification techniques, with special emphasis on emerging deepfake video detection approaches. Given the advances in creating ever more realistic deepfake videos, this review aims to facilitate the development of more generalized detection methods through a thorough discussion of research trends in deepfake detection.


Notes

GAN: Generative Adversarial Network
http://www.fakeapp.com/
United States v Beeler, 62 F Supp. 2d 136 (July 1, 1999, United States District Court, D. Maine).
Dolan v State of Florida, 743 S. 2d 544 (July 21, 1999, Court of Appeal of Florida, Fourth District).
Defense Advanced Research Projects Agency
Coarse-to-Fine Deep Convolutional Neural Network
Residual Network
This dataset is available as a part of FaceForensics.
DeepFake Detection Challenge
Deep Neural Network
https://www.ncbi.nlm.nih.gov/pubmed/9399231.
http://bionumbers.hms.harvard.edu/bionumber.aspx?id=100706&ver=0.
CNN: Convolutional Neural Network
LRCN: Long-Term Recurrent CNN, a combination of CNN and LSTM
LSTM: Long Short Term Memory
VAE: Variational AutoEncoder
Progressive Growing GAN
https://github.com/EricGzq/Hybrid-Fake-Face-Dataset
Gated Recurrent Unit
Root Mean Square Energy
Available at: https://www.descript.com/lyrebird-ai?source=lyrebird
Table 15 Lip-sync Deepfake detection techniques (A: Accuracy, EER: Effective Error Rate)
Available at: https://www.asvspoof.org/
Available at: https://github.com/resemble-ai/Resemblyzer

References

Adami N, Signoroni A, Leonardi R (2007) State-of-the-art and trends in scalable video compression with wavelet-based approaches. IEEE Trans Circ Syst Video Technol 17(9):1238–1255
Article Google Scholar
Afchar D, Nozick V, Yamagishi J, Echizen I (2018) Mesonet: a compact facial video forgery detection network. In: 2018 IEEE International workshop on information forensics and security (WIFS). IEEE, pp 1–7
Agarwal S, El-Gaaly T, Farid H, Lim S N (2020) Detecting deep-fake videos from appearance and behavior. arXiv:2004.14491
Agarwal S, Farid H (2021) Detecting deep-fake videos from aural and oral dynamics
Agarwal S, Farid H, Gu Y, He M, Nagano K, Li H (2019) Protecting world leaders against deep fakes. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 38–45
Ajder H Deepfake threat intelligence: a statistics snapshot from june 2020. http://deeptracelabs.com/deepfake-threat-intelligence-a-statistics-snapshot-from-june-2020/
Al-Sanjary O I, Ahmed A A, Sulong G (2016) Development of a video tampering dataset for forensic investigation. Forensic Sci Int 266:565–572
Article Google Scholar
Amerini I, Galteri L, Caldelli R, Del Bimbo A (2019) Deepfake video detection through optical flow based cnn. In: Proceedings of the IEEE international conference on computer vision workshops, pp 0–0
Anina I, Zhou Z, Zhao G, Pietikäinen M (2015) Ouluvs2: a multi-view audiovisual database for non-rigid mouth motion analysis. In: 2015 11Th IEEE international conference and workshops on automatic face and gesture recognition (FG), vol 1. IEEE, pp 1–5
APTLY: Audio processing techniques lab at york. http://bil.eecs.yorku.ca/aptly-lab./
Aslani S, Mahdavi-Nasab H (2013) Optical flow based moving object detection and tracking for traffic surveillance. Int J Electr Comput Eng 7(9):1252–1256
Google Scholar
Baddar W J, Gu G, Lee S, Ro Y M (2017) Dynamics transfer gan:, Generating video by transferring arbitrary temporal dynamics from a source video to a single target image. Accessed 5 May 2021. arXiv:1712.03534
Baidu text-to-speech system. https://cloud.baidu.com/product/speech/tts
Baltrušaitis T, Robinson P, Morency LP (2016) Openface: an open source facial behavior analysis toolkit. In: 2016 IEEE Winter conference on applications of computer vision (WACV). IEEE, pp 1–10
Barker J (2013) The grid audiovisual sentence corpus, available at: http://spandh.dcs.shef.ac.uk/gridcorpus/
Bidokhti A, Ghaemmaghami S (2015) Detection of regional copy/move forgery in mpeg videos using optical flow. In: 2015 The international symposium on artificial intelligence and signal processing (AISP). IEEE, pp 13–17
Bonettini N, Cannas E D, Mandelli S, Bondi L, Bestagini P, Tubaro S (2020)
Bregler C, Covell M, Slaney M (1997) Video rewrite: Driving visual speech with audio. In: Proceedings of the 24th annual conference on Computer graphics and interactive techniques, pp 353–360
Bromley J, Guyon I, LeCun Y, Säckinger E, Shah R (1994) Signature verification using a” siamese” time delay neural network. In: Advances in neural information processing systems, pp 737–744
Caldelli R, Galteri L, Amerini I, Del Bimbo A (2021) Optical flow based cnn for detection of unlearnt deepfake manipulations. Pattern Recogn Lett 146:31–37
Article Google Scholar
Chakravarty P, Tuytelaars T (2016) Cross-modal supervision for learning active speaker detection in video. In: European conference on computer vision. Springer, pp 285–301
Chan C, Ginosar S, Zhou T, Efros A A (2019) Everybody dance now. In: Proceedings of the IEEE international conference on computer vision, pp 5933–5942
Chao J, Jiang X, Sun T (2012) A novel video inter-frame forgery model detection scheme based on optical flow consistency. In: International workshop on digital watermarking. Springer, pp 267–281
Chatfield K, Simonyan K, Vedaldi A, Zisserman A (2014) Return of the devil in the details:, Delving deep into convolutional nets. arXiv:1405.3531
Chen H, Chandrasekar V, Tan H, Cifelli R (2019) Rainfall estimation from ground radar and trmm precipitation radar using hybrid deep neural networks. Geophysical Research Letters
Chen H, Wo Y, Han G (2018) Multi-granularity geometrically robust video hashing for tampering detection. Multimed Tools Appl 77(5):5303–5321
Article Google Scholar
Chen T, Kumar A, Nagarsheth P, Sivaraman G, Khoury E (2020) Generalization of audio deepfake detection. In: Proceedings of the Odyssey 2020 the speaker and language recognition workshop, pp 132–137
Chen T Q, Rubanova Y, Bettencourt J, Duvenaud D. K (2018) Neural ordinary differential equations. In: Advances in neural information processing systems, pp 6571–6583
Cheung G K, Baker S, Hodgins J, Kanade T (2004) Markerless human motion transfer. In: Proceedings of the 2nd international symposium on 3d data processing, visualization and transmission, 2004. 3DPVT 2004. IEEE, pp 373–378
Chingovska I, Anjos A, Marcel S (2012) On the effectiveness of local binary patterns in face anti-spoofing. In: 2012 BIOSIG-proceedings of the international conference of biometrics special interest group (BIOSIG). IEEE, pp 1–7
Chintha A, Thai B, Sohrawardi S J, Bhatt K, Hickerson A, Wright M, Ptucha R (2020) Recurrent convolutional structures for audio spoof and video deepfake detection. IEEE J Sel Top Signal Process 14(5):1024–1037
Article Google Scholar
Cho W, Choi S, Park D. K, Shin I, Choo J (2019) Image-to-image translation via group-wise deep whitening-and-coloring transformation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 10639–10647
Choi Y, Choi M, Kim M, Ha J W, Kim S, Choo J (2018) Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8789–8797
Choi Y, Choi M, Kim M, Ha J W, Kim S, Choo J (2018) Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8789–8797
Chollet F (2017) Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1251–1258
Chugh K, Gupta P, Dhall A, Subramanian R (2020)
Chung J S, Zisserman A (2016) Lip reading in the wild. In: Asian conference on computer vision. Springer, pp 87–103
Chung J S, Zisserman A (2016) Out of time: automated lip sync in the wild. In: Asian conference on computer vision. Springer, pp 251–263
Ciftci U A, Demir I (2019) Fakecatcher:, Detection of synthetic portrait videos using biological signals. arXiv:1901.02212
Cole S (2017) Ai-assisted fake porn is here and we’re all fucked https://www.vice.com/en_us/article/gydydm/gal-gadot-fake-ai-porn
collection, D.: Xiph.org video test media. Accessed 5 May 2021. https://media.xiph.org/video/derf/
Cozzolino D, Rössler A, Thies J, Nießner M, Verdoliva L (2020) Id-reveal:, Identity-aware deepfake video detection. arXiv:2012.02512
D’Amiano L, Cozzolino D, Poggi G, Verdoliva L (2018) A patchmatch-based dense-field algorithm for video copy–move detection and localization. IEEE Trans Circ Syst Video Technol 29(3):669–682
Article Google Scholar
De Roover C, De Vleeschouwer C, Lefebvre F, Macq B (2005) Robust video hashing based on radial projections of key frames. IEEE Trans Signal Process 53(10):4020–4037
Article MathSciNet Google Scholar
Demir I, Ciftci U A (2021) Where do deep fakes look? synthetic face detection via gaze tracking. arXiv:2101.01165
(2019) Dessa: Detecting audio deepfakes with ai. available at:. https://medium.com/dessa-news/detecting-audio-deepfakes-f2edfd8e2b35
Ding X, Zhang D (2019) Detection of motion-compensated frame-rate up-conversion via optical flow-based prediction residue. Optik p 163766
Dolhansky B, Bitton J, Pflaum B, Lu J, Howes R, Wang M, Ferrer C C (2020) The deepfake detection challenge dataset. arXiv:2006.07397
Dolhansky B, Howes R, Pflaum B, Baram N, Ferrer C C (2019) The deepfake detection challenge (dfdc) preview dataset. arXiv:1910.08854
Dong Q, Yang G, Zhu N (2012) A mcea based passive forensics scheme for detecting frame-based video tampering. Digit Investig 9(2):151–159
Article Google Scholar
Dufour N (2019) Google ai blog. contributing data to deepfake detection research. Accessed 5 May 2021. https://ai.googleblog.com/2019/09/contributing-data-to-deepfake-detection.html
Durall R, Keuper M, Keuper J (2020) Watch your up-convolution: Cnn based generative deep neural networks are failing to reproduce spectral distributions. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7890–7899
Durall R, Keuper M, Pfreundt F. J, Keuper J (2019) Unmasking deepfakes with simple features. arXiv:1911.00686
Esser P, Haux J, Milbich T, et al. (2018) Towards learning a realistic rendering of human behavior. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 0–0
Feng D, Lu X, Lin X (2020) Deep detection for face manipulation. In: International conference on neural information processing. Springer, pp 316–323
Fernandes S, Raj S, Ortiz E, Vintila I, Salter M, Urosevic G, Jha S (2019) Predicting heart rate variations of deepfake videos using neural ode. In: Proceedings of the IEEE international conference on computer vision workshops, pp 0–0
Fernando T, Fookes C, Denman S, Sridharan S (2019) Exploiting human social cognition for the detection of fake and fraudulent faces via memory networks. arXiv:1911.07844
Garg R, Varna A L, Hajj-Ahmad A, Wu M (2013) “seeing” enf: power-signature-based timestamp for digital multimedia via optical sensing and signal processing. IEEE Trans Inf Forensics Secur 8(9):1417–1432
Article Google Scholar
Garrido P, Valgaerts L, Sarmadi H, Steiner I, Varanasi K, Perez P, Theobalt C (2015) Vdub: Modifying face video of actors for plausible visual alignment to a dubbed audio track. In: Computer graphics forum, vol 34. Wiley Online Library, pp 193–204
Grisham S (2018) Stephanie grisham on twitter. tampering performed on white house secretary’s video https://twitter.com/PressSec/status/1060374680991883265
Guan H, Kozak M, Robertson E, Lee Y, Yates A N, Delgado A, Zhou D, Kheyrkhah T, Smith J, Fiscus J (2019) Mfc datasets: Large-scale benchmark datasets for media forensic challenge evaluation. In: 2019 IEEE Winter applications of computer vision workshops (WACVW). IEEE, pp 63–72
Guan W, Wang W, Dong J, Peng B, Tan T (2021) Robust face-swap detection based on 3d facial shape information. arXiv:2104.13665
Guarnera L, Giudice O, Battiato S (2020) Deepfake detection by analyzing convolutional traces. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 666–667
Güera D, Baireddy S, Bestagini P, Tubaro S, Delp E J (2019) We need no pixels:, Video manipulation detection using stream descriptors. arXiv:1906.08743
Güera D, Delp E J (2018) Deepfake video detection using recurrent neural networks. In: 2018 15Th IEEE international conference on advanced video and signal based surveillance (AVSS). IEEE, pp 1–6
Guo Z, Yang G, Chen J, Sun X (2020) Fake face detection via adaptive residuals extraction network. arXiv:2005.04945
Haliassos A, Vougioukas K, Petridis S, Pantic M (2020) Lips don’t lie:, A generalisable and robust approach to face forgery detection. arXiv:2012.07657
Hasan H R, Salah K (2019) Combating deepfake videos using blockchain and smart contracts. IEEE Access 7:41596–41606
Article Google Scholar
He Z, Zuo W, Kan M, Shan S, Chen X (2019) Attgan: Facial attribute editing by only changing what you want. IEEE Trans Image Process 28 (11):5464–5478
Article MathSciNet Google Scholar
He Z, Zuo W, Kan M, Shan S, Chen X (2019) Attgan: Facial attribute editing by only changing what you want. IEEE Trans Image Process 28 (11):5464–5478
Article MathSciNet Google Scholar
Hecker C, Raabe B, Enslow R. W, DeWeese J, Maynard J, van Prooijen K (2008) Real-time motion retargeting to highly varied user-created morphologies. ACM Transactions on Graphics (TOG) 27(3):1–11
Article Google Scholar
Hernandez-Ortega J, Tolosana R, Fierrez J, Morales A (2020) Deepfakeson-phys:, Deepfakes detection based on heart rate estimation. arXiv:2010.00400
Horn B K, Schunck B G (1981) Determining optical flow. Artificial intelligence 17(1–3):185–203
Article Google Scholar
Hsieh C K, Chiu C C, Su P C (2018) Video forensics for detecting shot manipulation using the information of deblocking filtering. In: 2018 IEEE 42Nd annual computer software and applications conference (COMPSAC), vol 2. IEEE, pp 353–358
Huang Y, Juefei-Xu F, Wang R, Xie X, Ma L, Li J, Miao W, Liu Y, Pu G (2020) Fakelocator:, Robust localization of gan-based face manipulations via semantic segmentation networks with bells and whistles. arXiv:2001.09598
Jeon H, Bang Y, Woo S S (2020) Fdftnet:, Facing off fake images using fake detection fine-tuning network. arXiv:2001.01265
Jiang L, Wu W, Li R, Qian C, Loy C C (2020) Deeperforensics-1.0:, A large-scale dataset for real-world face forgery detection. arXiv:2001.03024
Jr E O (2019) Thieves used audio deepfake of a ceo to steal $243,000 https://www.vice.com/en_in/article/d3a7qa/thieves-used-audio-deep-fake-of-a-ceo-to-steal-dollar243000
Karras T, Aila T, Laine S, Lehtinen J (2017) Progressive growing of gans for improved quality, stability, and variation. arXiv:1710.10196
Karras T, Laine S, Aila T (2019) A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4401–4410
Karras T, Laine S, Aittala M, Hellsten J, Lehtinen J, Aila T (2019) Analyzing and improving the image quality of stylegan. arXiv:1912.04958
Khalid H, Woo S S (2020) Oc-fakedect: Classifying deepfakes using one-class variational autoencoder. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 656–657
Khalil S S, Youssef S M, Saleh SN (2021) icaps-dfake: an integrated capsule-based model for deepfake image and video detection. Future Internet 13(4):93
Article Google Scholar
Khan S A, Artusi A, Dai H (2021)
Khodabakhsh A, Ramachandra R, Raja K, Wasnik P, Busch C (2018) Fake face detection methods: Can they be generalized?. In: 2018 International conference of the biometrics special interest group (BIOSIG). IEEE, pp 1–6
Kingma D P, Dhariwal P (2018) Glow: Generative flow with invertible 1x1 convolutions. In: Advances in neural information processing systems, pp 10215–10224
Kingma D P, Welling M (2013) Auto-encoding variational bayes. arXiv:1312.6114
Kingra S, Aggarwal N, Singh R. D (2016) Video inter-frame forgery detection: A survey. Indian J Sci Technol 9(44)
Kingra S, Aggarwal N, Singh R D (2017) Inter-frame forgery detection in h. 264 videos using motion and brightness gradients. Multimed Tools Appl 76(24):25767–25786
Article Google Scholar
Kobayashi K, Toda T (2018) Sprocket: Open-source voice conversion software. In: Odyssey, pp 203–210
Kobayashi M, Okabe T, Sato Y (2010) Detecting forgery from static-scene video based on inconsistency in noise level functions. IEEE Trans Inf Forensics Secur 5(4):883–892
Article Google Scholar
Kohli A, Gupta A (2021) Detecting deepfake, faceswap and face2face facial forgeries using frequency cnn. Multimedia Tools and Applications, pp 1–18
Korshunov P, Halstead M, Castan D, Graciarena M, McLaren M, Burns B, Lawson A, Marcel S (2019) Tampered speaker inconsistency detection with phonetically aware audio-visual features. In: International conference on machine learning, CONF
Korshunov P, Marcel S (2018) Deepfakes:, a new threat to face recognition? assessment and detection. arXiv:1812.08685
Korshunov P, Marcel S (2018) Speaker inconsistency detection in tampered video. In: 2018 26Th european signal processing conference (EUSIPCO). IEEE, pp 2375–2379
Kumar A, Bhavsar A, Verma R (2020) Detecting deepfakes with metric learning. In: 2020 8Th international workshop on biometrics and forensics (IWBF). IEEE, pp 1–6
Kumar N, Kaur N, Gupta D (2020) Major convolutional neural networks in image classification: a survey. In: Proceedings of International Conference on IoT Inclusive Life (ICIIL 2019), NITTTR Chandigarh, India. Springer, pp 243–258
Kumar N, Kaur N, Gupta D (2020) Red green blue depth image classification using pre-trained deep convolutional neural network. Pattern Recognit Image Anal 30(3):382–390
Article Google Scholar
Kumar P, Vatsa M, Singh R (2020) Detecting face2face facial reenactment in videos. arXiv:2001.07444
Laptev I, Marszałek M, Schmid C, Rozenfeld B (2008) Learning realistic human actions from movies
Lee S, Tariq S, Kim J, Woo S. S (2021) Tar:, Generalized forensic framework to detect deepfakes using weakly supervised learning. arXiv:2105.06117
Lee S, Yoo C D (2006) Video fingerprinting based on centroids of gradient orientations. In: 2006 IEEE International conference on acoustics speech and signal processing proceedings, vol 2. IEEE, pp II–II
Lee S, Yoo C D (2008) Robust video fingerprinting based on affine covariant regions. In: 2008 IEEE International conference on acoustics, speech and signal processing. IEEE, pp 1237–1240
Li H, Hu L, Wei L, Nagano K, Jaewoo S, Fursund J, Saito S Avatar digitization from a single image for real-time rendering (2020). US Patent 10,535,163
Li L, Bao J, Zhang T, Yang H, Chen D, Wen F, Guo B (2019) Face x-ray for more general face forgery detection. arXiv:1912.13458
Li M, Monga V (2012) Robust video hashing via multilinear subspace projections. IEEE Transactions on Image Processing 21(10):4397–4409
Article MathSciNet Google Scholar
Li R, Liu Z, Zhang Y, Li Y, Fu Z (2018) Noise-level estimation based detection of motion-compensated frame interpolation in video sequences. Multimedia Tools and Applications 77(1):663–688
Article Google Scholar
Li X, Lang Y, Chen Y, Mao X, He Y, Wang S, Xue H, Lu Q (2020) Sharp multiple instance learning for deepfake video detection. arXiv:2008.04585
Li Y, Chang M. C, Lyu S (2018) In ictu oculi:, Exposing ai generated fake face videos by detecting eye blinking. arXiv:1806.02877
Li Y, Yang X, Sun P, Qi H, Lyu S (2019) Celeb-df:, A new dataset for deepfake forensics. arXiv:1909.12962
Liu M, Ding Y, Xia M, Liu X, Ding E, Zuo W, Wen S (2019) Stgan: a unified selective transfer network for arbitrary image attribute editing. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3673–3682
Liu Y, Guan Q, Zhao X, Cao Y (2018) Image forgery localization based on multi-scale convolutional neural networks. In: Proceedings of the 6th ACM workshop on information hiding and multimedia security, pp 85–90
Liu Z, Luo P, Wang X, Tang X (2015) Deep learning face attributes in the wild. In: Proceedings of the IEEE international conference on computer vision, pp 3730–3738
Long C, Basharat A, Hoogs A (2019) A coarse-to-fine deep convolutional neural network framework for frame duplication detection and localization in forged videos. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 1–10
Lucas B. D, Kanade T et al (1981) An iterative image registration technique with an application to stereo vision
Malekesmaeili M, Fatourechi M, Ward R K (2009) Video copy detection using temporally informative representative images. In: 2009 International conference on machine learning and applications. IEEE, pp 69–74
Maras M H, Alexandrou A (2019) Determining authenticity of video evidence in the age of artificial intelligence and in the wake of deepfake videos. The Int J Evid Proof 23(3):255–262
Article Google Scholar
Mase K (1991) Recognition of facial expression from optical flow. IEICE Trans Inf Syst 74(10):3474–3483
Google Scholar
Masi I, Killekar A, Mascarenhas RM, Gurudatt S. P, AbdAlmageed W (2020) Two-branch recurrent network for isolating deepfakes in videos. arXiv:2008.03412
Matern F, Riess C, Stamminger M (2019) Exploiting visual artifacts to expose deepfakes and face manipulations. In: 2019 IEEE Winter applications of computer vision workshops (WACVW). IEEE, pp 83–92
Mehra A (2020) Deepfake detection using capsule networks with long short-term memory networks. Master’s thesis, University of Twente
Milani S, Bestagini P, Tagliasacchi M, Tubaro S (2012) Multiple compression detection for video sequences. In: 2012 IEEE 14Th international workshop on multimedia signal processing (MMSP). IEEE, pp 112–117
Mirsky Y, Lee W (2021) The creation and detection of deepfakes: a survey. ACM Computing Surveys (CSUR) 54(1):1–41
Article Google Scholar
Mittal T, Bhattacharya U, Chandra R, Bera A, Manocha D (2020) Emotions don’t lie:, A deepfake detection method using audio-visual affective cues. arXiv:2003.06711
Mohammadi SH (2019) Text to speech synthesis using deep neural network with constant unit length spectrogram. US Patent 10,186,252
Montserrat D M, Hao H, Yarlagadda S K, Baireddy S, Shao R, Horváth J, Bartusiak E, Yang J, Güera D, Zhu F et al (2020) Deepfakes detection with automatic face weighting. arXiv:2004.12027
Nagothu D, Chen Y, Blasch E, Aved A, Zhu S (2019) Detecting malicious false frame injection attacks on surveillance systems at the edge using electrical network frequency signals. Sensors 19(11):2424
Article Google Scholar
Nagothu D, Schwell J, Chen Y, Blasch E, Zhu S (2019) A study on smart online frame forging attacks against video surveillance system. In: Sensors and systems for space applications XII, vol 11017. International Society for Optics and Photonics, p 110170L
Nguyen H H, Fang F, Yamagishi J, Echizen I (2019) Multi-task learning for detecting and segmenting manipulated facial images and videos. arXiv:1906.06876
Nguyen H H, Yamagishi J, Echizen I (2019) Capsule-forensics: Using capsule networks to detect forged images and videos. In: ICASSP 2019-2019 IEEE International conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 2307–2311
Nguyen H M, Derakhshani R (2020) Eyebrow recognition for identifying deepfake videos. In: 2020 International conference of the biometrics special interest group (BIOSIG). IEEE, pp 1–5
Nguyen T T, Nguyen C M, Nguyen D T, Nguyen D T, Nahavandi S (2019) Deep learning for deepfakes creation and detection. arXiv:1909.11573
Nguyen X H, Tran T S, Nguyen K D, Truong D T, et al. (2021) Learning spatio-temporal features to detect manipulated facial videos created by the deepfake techniques. Forensic Science International: Digital Investigation 36:301108
Google Scholar
Nirkin Y, Keller Y, Hassner T (2019) Fsgan: Subject agnostic face swapping and reenactment. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 7184–7193
Nirkin Y, Wolf L, Keller Y, Hassner T (2020) Deepfake detection based on the discrepancy between the face and its context. arXiv:2008.12262
Noguchi A, Yanai K (2010) A surf-based spatio-temporal feature for feature-fusion-based action recognition. In: European conference on computer vision. Springer, pp 153–167
Oord Avd, Dieleman S, Zen H, Simonyan K, Vinyals O, Graves A, Kalchbrenner N, Senior A, Kavukcuoglu K (2016) Wavenet:, A generative model for raw audio. arXiv:1609.03499
Oostveen J, Kalker T, Haitsma J (2002) Feature extraction and a database strategy for video fingerprinting. In: International conference on advances in visual information systems. Springer, pp 117–128
Ouyang J, Liu Y, Shu H (2017) Robust hashing for image authentication using sift feature and quaternion zernike moments. Multimed Tools Appl 76(2):2609–2626
Article Google Scholar
Papadopoulou O, Zampoglou M, Papadopoulos S, Kompatsiaris Y, Teyssou D (2018) Invid fake video corpus v2. 0 (version 2.0) Dataset on Zenodo
Parkhi O M, Vedaldi A, Zisserman A (2015) Deep face recognition
Posters B (2018) Bill posters on instagram. artificially generated video of mark zuckerberg https://twitter.com/PressSec/status/1060374680991883265
Project A (2017) Ami corpus download. available at: http://groups.inf.ed.ac.uk/ami/download/
Project R Tools for digital forensics. http://www.rewindproject.eu/
Qadir G, Yahaya S, Ho AT (2012) Surrey university library for forensic analysis (sulfa) of video content
Qi H, Guo Q, Juefei-Xu F, Xie X, Ma L, Feng W, Liu Y, Zhao J (2020) Deeprhythm: exposing deepfakes with attentional visual heartbeat rhythms. In: Proceedings of the 28th ACM international conference on multimedia, pp 4318–4327
Rössler A, Cozzolino D, Verdoliva L, Riess C, Thies J, Nießner M (2018) Faceforensics: A large-scale video dataset for forgery detection in human faces. arXiv:1803.09179
Rössler A, Cozzolino D, Verdoliva L, Riess C, Thies J, Nießner M (2019) Faceforensics++: Learning to detect manipulated facial images. arXiv:1901.08971
Roy S, Sun Q (2007) Robust hash for detecting and localizing image tampering. In: 2007 IEEE International conference on image processing, vol 6. IEEE, pp VI–117
Sabir E, Cheng J, Jaiswal A, AbdAlmageed W, Masi I, Natarajan P (2019) Recurrent convolutional strategies for face manipulation detection in videos. Interfaces (GUI) 3:1
Google Scholar
Saikia N (2015) Perceptual hashing in the 3d-dwt domain. In: 2015 International conference on green computing and internet of things (ICGCIot). IEEE, pp 694–698
Sanderson C (2019) Vidtimit audio-video dataset. available at: http://conradsanderson.id.au/vidtimit/
Saunders J, Comerford A, Williams G (2019) Detecting deep fakes with mice: Machines vs biology https://i.blackhat.com/USA-19/wednesday/us-19-williams-detecting-deep-Fakes-With-Mice-wp.pdf
Saxena S, Subramanyam A, Ravi H (2016) Video inpainting detection and localization using inconsistencies in optical flow. In: 2016 IEEE Region 10 conference (TENCON). IEEE, pp 1361–1365
Seeling P, Reisslein M (2001) Video traces research group http://trace.eas.asu.edu/
Selvaraju R R, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2017) Grad-cam: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision, pp 618–626
Shang Z, Xie H, Zha Z, Yu L, Li Y, Zhang Y (2021) Prrnet: Pixel-region relation network for face forgery detection. Pattern Recogn 116:107950
Article Google Scholar
Shen J, Pang R, Weiss R J, Schuster M, Jaitly N, Yang Z, Chen Z, Zhang Y, Wang Y, Skerrv-Ryan R et al (2018) Natural tts synthesis by conditioning wavenet on mel spectrogram predictions. In: 2018 IEEE International conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 4779–4783
Singh R D, Aggarwal N (2017) Detection of upscale-crop and splicing for digital video authentication. Digit Investig 21:31–52
Article Google Scholar
Singh RD, Aggarwal N (2017) Optical flow and prediction residual based hybrid forensic system for inter-frame tampering detection. Journal of Circuits, Systems and Computers 26(07):1750107
Article Google Scholar
Singh R D, Aggarwal N (2018) Video content authentication techniques: a comprehensive survey. Multimed Syst 24(2):211–240
Article Google Scholar
Song F, Tan X, Liu X, Chen S (2014) Eyes closeness detection from still images with multi-scale histograms of principal oriented gradients. Pattern Recogn 47(9):2825–2838
Article Google Scholar
Sowmya K, Chennamma H (2015) A survey on video forgery detection. Int J Comput Eng Appl 9(2):17–27
Google Scholar
Stehouwer J, Dang H, Liu F, Liu X, Jain A (2019) On the detection of digital face manipulation. arXiv:1910.01717
Su Y, Xu J (2010) Detection of double-compression in mpeg-2 videos. In: 2010 2Nd international workshop on intelligent systems and applications. IEEE, pp 1–4
Sun K, Zhao Y, Jiang B, Cheng T, Xiao B, Liu D, Mu Y, Wang X, Liu W, Wang J (2019) High-resolution representations for labeling pixels and regions. arXiv:1904.04514
Sun Q, Liu Y, Chua T. S, Schiele B (2019) Meta-transfer learning for few-shot learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 403–412
Sun X, Wu B, Chen W (2020) Identifying invariant texture violation for robust deepfake detection. arXiv:2012.10580
Suwajanakorn S, Seitz S M, Kemelmacher-Shlizerman I (2017) Synthesizing obama: learning lip sync from audio. ACM Transactions on Graphics (TOG) 36(4):1–13
Article Google Scholar
Tachibana H, Uenoyama K, Aihara S (2018) Efficiently trainable text-to-speech system based on deep convolutional networks with guided attention. In: 2018 IEEE International conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 4784–4788
Tamgade S N, Bora V R (2009) Motion vector estimation of video image by pyramidal implementation of lucas kanade optical flow. In: 2009 Second international conference on emerging trends in engineering & technology. IEEE, pp 914–917
Tan M, Le Q V (2019) Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv:1905.11946
Tariq S, Lee S, Woo S S (2020) A convolutional lstm based residual network for deepfake video detection. arXiv:2009.07480
Thies J, Elgharib M, Tewari A, Theobalt C, Nießner M (2019) Neural voice puppetry: Audio-driven facial reenactment. arXiv:1912.05566
Thies J, Zollhöfer M, Nießner M (2019) Deferred neural rendering: Image synthesis using neural textures. ACM Transactions on Graphics (TOG) 38(4):1–12
Thies J, Zollhöfer M, Stamminger M, Theobalt C, Nießner M (2016) Face2face: Real-time face capture and reenactment of rgb videos. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2387–2395
Tian Y, Pei K, Jana S, Ray B (2018) Deeptest: Automated testing of deep-neural-network-driven autonomous cars. In: Proceedings of the 40th international conference on software engineering. ACM, pp 303–314
Todisco M, Wang X, Vestman V, Sahidullah M, Delgado H, Nautsch A, Yamagishi J, Evans N, Kinnunen T, Lee K A (2019) Asvspoof 2019: Future horizons in spoofed and fake audio detection. arXiv:1904.05441
Tolosana R, Vera-Rodriguez R, Fierrez J, Morales A, Ortega-Garcia J (2020) Deepfakes and beyond: A survey of face manipulation and fake detection. arXiv:2001.00179
TRECVID: Trec video retrieval evaluation. http://trecvid.nist.gov/
Tulyakov S, Liu M Y, Yang X, Kautz J (2018) Mocogan: Decomposing motion and content for video generation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1526–1535
Verdoliva L (2020) Media forensics and deepfakes: an overview. arXiv:2001.06564
Vincent J (2018) Watch jordan peele use ai to make barack obama deliver a psa about fake news. https://www.theverge.com/tldr/2018/4/17/17247334/ai-fake-news-video-barack-obama-jordan-peele-buzzfeed
Wahab A W A, Bagiwa M A, Idris M Y I, Khan S, Razak Z, Ariffin M R K (2014) Passive video forgery detection techniques: a survey. In: 2014 10th international conference on information assurance and security. IEEE, pp 29–34
Wan L, Wang Q, Papir A, Moreno I L (2018) Generalized end-to-end loss for speaker verification. In: 2018 IEEE International conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 4879–4883
Wang J, Wu Z, Chen J, Jiang Y G (2021) M2tr: Multi-modal multi-scale transformers for deepfake detection. arXiv:2104.09770
Wang Q, Li Z, Zhang Z, Ma Q (2014) Video inter-frame forgery identification based on optical flow consistency. Sensors & Transducers 166(3):229
Wang R, Juefei-Xu F, Huang Y, Guo Q, Xie X, Ma L, Liu Y (2020) Deepsonar: Towards effective and robust detection of ai-synthesized fake voices. arXiv:2005.13770
Wang R, Juefei-Xu F, Ma L, Xie X, Huang Y, Wang J, Liu Y (2020) Fakespotter: a simple yet robust baseline for spotting ai-synthesized fake faces. In: International joint conference on artificial intelligence (IJCAI)
Wang S Y, Wang O, Zhang R, Owens A, Efros A A (2020) Cnn-generated images are surprisingly easy to spot... for now. In: Proceedings of the IEEE conference on computer vision and pattern recognition, vol 7
Wang T C, Liu M Y, Zhu J Y, Liu G, Tao A, Kautz J, Catanzaro B (2018) Video-to-video synthesis. arXiv:1808.06601
Wang W, Farid H (2006) Exposing digital forgeries in video by detecting double mpeg compression. In: Proceedings of the 8th workshop on Multimedia and security. ACM, pp 37–47
Wang W, Farid H (2009) Exposing digital forgeries in video by detecting double quantization. In: Proceedings of the 11th ACM workshop on Multimedia and security. ACM, pp 39–48
Wang W, Jiang X, Wang S, Wan M, Sun T (2013) Identifying video forgery process using optical flow. In: International workshop on digital watermarking. Springer, pp 244–257
Wang Y, Skerry-Ryan R, Stanton D, Wu Y, Weiss R J, Jaitly N, Yang Z, Xiao Y, Chen Z, Bengio S et al (2017) Tacotron: Towards end-to-end speech synthesis. arXiv:1703.10135
Wheatley T, Weinberg A, Looser C, Moran T, Hajcak G (2011) Mind perception: Real but not artificial faces sustain neural activity beyond the n170/vpp. PloS one 6(3)
Wiles O, Koepke A, Zisserman A (2018) Self-supervised learning of a facial attribute embedding from video. arXiv:1808.06882
Wodajo D, Atnafu S (2021) Deepfake video detection using convolutional vision transformer. arXiv:2102.11126
Xie W, Nagrani A, Chung J S, Zisserman A (2019) Utterance-level aggregation for speaker recognition in the wild. In: ICASSP 2019-2019 IEEE International conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 5791–5795
Xu F, Liu Y, Stoll C, Tompkin J, Bharaj G, Dai Q, Seidel H P, Kautz J, Theobalt C (2011) Video-based characters: creating new human performances from a multi-view video database. In: ACM SIGGRAPH 2011 Papers, pp 1–10
Yang X, Li Y, Lyu S (2019) Exposing deep fakes using inconsistent head poses. In: ICASSP 2019-2019 IEEE International conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 8261–8265
Yoo D G, Kang S J, Kim Y H (2013) Direction-select motion estimation for motion-compensated frame rate up-conversion. J Disp Technol 9(10):840–850
Zampoglou M, Markatopoulou F, Mercier G, Touska D, Apostolidis E, Papadopoulos S, Cozien R, Patras I, Mezaris V, Kompatsiaris I (2019) Detecting tampered videos with multimedia forensics and deep learning. In: International conference on multimedia modeling. Springer, pp 374–386
Zhang X, Li H, Qi Y, Leow W K, Ng T K (2006) Rain removal in video by combining temporal and chromatic properties. In: 2006 IEEE International conference on multimedia and expo. IEEE, pp 461–464
Zhang Z, Robinson D, Tepper J (2018) Detecting hate speech on twitter using a convolution-gru based deep neural network. In: European semantic web conference. Springer, pp 745–760
Zhao T, Xu X, Xu M, Ding H, Xiong Y, Xia W (2020) Learning to recognize patch-wise consistency for deepfake detection. arXiv:2012.09311
Zhao Y, Wang S, Feng G, Tang Z (2010) A robust image hashing method based on zernike moments. J Comput Inf Syst 6(3):717–725
Zhu B, Fang H, Sui Y, Li L (2020) Deepfakes for medical video de-identification: Privacy protection and diagnostic information preservation. In: Proceedings of the AAAI/ACM conference on ai, ethics, and society, pp 414–420

Acknowledgments

This work was carried out at the Design Innovation Center, Panjab University, Chandigarh, India, established by the Ministry of Education, Government of India.

Author information

Authors and Affiliations

University Institute of Engineering and Technology, Panjab University, Chandigarh, India
Staffy Kingra, Naveen Aggarwal & Nirmal Kaur

Corresponding author

Correspondence to Staffy Kingra.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kingra, S., Aggarwal, N. & Kaur, N. Emergence of deepfakes and video tampering detection approaches: A survey. Multimed Tools Appl (2022). https://doi.org/10.1007/s11042-022-13100-x

Received: 24 September 2020
Revised: 17 February 2022
Accepted: 04 April 2022
Published: 05 August 2022
DOI: https://doi.org/10.1007/s11042-022-13100-x
