Abstract
The creation or manipulation of facial appearance through deep generative approaches, known as DeepFake, has achieved significant progress and enabled a wide range of benign and malicious applications, e.g., visual effects assistance in movies and misinformation generation by impersonating famous people. The harmful side of this technique has spurred another active line of research, i.e., DeepFake detection, which aims to distinguish fake faces from real ones. With the rapid development of DeepFake-related studies in the community, the two sides (i.e., DeepFake generation and detection) have formed an adversarial battleground, each pushing the other to improve and inspiring new directions, e.g., the evasion of DeepFake detection. Nevertheless, an overview of this battleground and the new directions it has opened remains unclear and has been neglected by recent surveys, owing to the rapid increase in related publications, which limits an in-depth understanding of current trends and future work. To fill this gap, in this paper we provide a comprehensive overview and detailed analysis of research on DeepFake generation, DeepFake detection, and the evasion of DeepFake detection, with more than 318 research papers carefully surveyed. We present a taxonomy of DeepFake generation methods and a categorization of DeepFake detection methods, and, more importantly, we showcase the battleground between the two parties with detailed interactions between the adversaries (DeepFake generation) and the defenders (DeepFake detection). The battleground offers a fresh perspective on the latest landscape of DeepFake research and provides valuable analysis of the research challenges and opportunities as well as research trends and future directions. We have also elaborately designed interactive diagrams (http://www.xujuefei.com/dfsurvey) to allow researchers to explore their own interests in popular DeepFake generators or detectors.
Li, T., & Lin, L. (2019). Anonymousnet: Natural face de-identification with measurable privacy. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops.
Li, X., Lang, Y., Chen, Y., Mao, X., He, Y., Wang, S., Xue, H., & Lu, Q. (2020d). Sharp multiple instance learning for deepfake video detection. In Proceedings of the 28th ACM international conference on multimedia (pp. 1864–1872).
Li, Y., & Lyu, S. (2019). Exposing deepfake videos by detecting face warping artifacts. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW). IEEE.
Li, Y., Chang, M. C., & Lyu, S. (2018b). In ictu oculi: Exposing AI created fake videos by detecting eye blinking. In 2018 IEEE international workshop on information forensics and security (WIFS) (pp. 1–7). IEEE.
Li, Y., Yang, X., Sun, P., Qi, H., & Lyu, S. (2020e). Celeb-df: A large-scale challenging dataset for deepfake forensics. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 3207–3216).
de Lima, O., Franklin, S., Basu, S., Karwoski, B., & George, A. (2020). Deepfake detection using spatiotemporal convolutional networks. arXiv preprint arXiv:2006.14749
Lin, C. H., Chang, C. C., Chen, Y. S., Juan, D. C., Wei, W., & Chen, H. T. (2019). Coco-gan: Generation by parts via conditional coordinating. In Proceedings of the IEEE international conference on computer vision (pp. 4512–4521).
Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft coco: Common objects in context. In European conference on computer vision (pp. 740–755). Springer.
Liu, B., Zhu, Y., Song, K., & Elgammal, A. (2021a). Self-supervised sketch-to-image synthesis. Proceedings of the AAAI Conference on Artificial Intelligence, 35, 2073–2081.
Liu, H., Li, X., Zhou, W., Chen, Y., He, Y., Xue, H., Zhang, W., & Yu, N. (2021b). Spatial-phase shallow learning: rethinking face forgery detection in frequency domain. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 772–781).
Liu, J., Zhang, W., Tang, Y., Tang, J., & Wu, G. (2020a). Residual feature aggregation network for image super-resolution. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 2359–2368).
Liu, M., Ding, Y., Xia, M., Liu, X., Ding, E., Zuo, W., & Wen, S. (2019). Stgan: A unified selective transfer network for arbitrary image attribute editing. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3673–3682).
Liu, S., Lin, T., He, D., Li, F., Deng, R., Li, X., Ding, E., & Wang, H. (2021c). Paint transformer: Feed forward neural painting with stroke prediction. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 6598–6607).
Liu, Z., Luo, P., Wang, X., & Tang, X. (2015). Deep learning face attributes in the wild. In Proceedings of the IEEE international conference on computer vision (pp. 3730–3738).
Liu, Z., Qi, X., & Torr, P. H. (2020b). Global texture enhancement for fake face detection in the wild. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 8060–8069).
Lu, S. A. (2018). FaceSwap-GAN. https://github.com/shaoanlu/faceswap-GAN
Luo, Y., Zhang, Y., Yan, J., & Liu, W. (2021). Generalizing face forgery detection with high-frequency features. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 16317–16326).
Lyu, S. (2020). Deepfake detection: Current challenges and next steps. In 2020 IEEE international conference on multimedia & expo workshops (ICMEW) (pp. 1–6). IEEE.
Mansourifar, H., & Shi, W. (2020). One-shot gan generated fake face detection. arXiv preprint arXiv:2003.12244
Mao, X., Li, Q., Xie, H., Lau, R. Y., & Wang, Z. (2016). Multi-class generative adversarial networks with the l2 loss function. arXiv preprint arXiv:1611.04076
Marra, F., Gragnaniello, D., Cozzolino, D., & Verdoliva, L. (2018). Detection of gan-generated fake images over social networks. In 2018 IEEE conference on multimedia information processing and retrieval (MIPR) (pp. 384–389). IEEE.
Marra, F., Gragnaniello, D., Verdoliva, L., & Poggi, G. (2019a). Do gans leave artificial fingerprints? In 2019 IEEE conference on multimedia information processing and retrieval (MIPR) (pp. 506–511). IEEE.
Marra, F., Saltori, C., Boato, G., & Verdoliva, L. (2019b). Incremental learning for the detection and classification of gan-generated images. In 2019 IEEE international workshop on information forensics and security (WIFS) (pp. 1–6). IEEE.
Marra, F., Gragnaniello, D., Verdoliva, L., & Poggi, G. (2020). A full-image full-resolution end-to-end-trainable CNN framework for image forgery detection. IEEE Access, 8, 133488–133502.
Mas Montserrat, D., Hao, H., Yarlagadda, S.K., Baireddy, S., Shao, R., Horvath, J., Bartusiak, E., Yang, J., Guera, D., Zhu, F., & Delp, E. J. (2020). Deepfakes detection with automatic face weighting. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 668–669).
Masi, I., Killekar, A., Mascarenhas, R. M., Gurudatt, S. P., & AbdAlmageed, W. (2020). Two-branch recurrent network for isolating deepfakes in videos. In European conference on computer vision (pp. 667–684). Springer.
Matern, F., Riess, C., & Stamminger, M. (2019). Exploiting visual artifacts to expose deepfakes and face manipulations. In 2019 IEEE winter applications of computer vision workshops (WACVW) (pp. 83–92). IEEE.
Maurer, U. M. (2000). Authentication theory and hypothesis testing. IEEE Transactions on Information Theory, 46(4), 1350–1356.
Maximov, M., Elezi, I., Leal-Taixé, L. (2020). Ciagan: Conditional identity anonymization generative adversarial networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5447–5456).
McCloskey, S., & Albright, M. (2019). Detecting gan-generated imagery using saturation cues. In 2019 IEEE international conference on image processing (ICIP) (pp. 4584–4588). IEEE.
Mei, Y., Fan, Y., Zhou, Y., Huang, L., Huang, T. S., & Shi, H. (2020). Image super-resolution with cross-scale non-local attention and exhaustive self-exemplars mining. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5690–5699).
Mirsky, Y., & Lee, W. (2021). The creation and detection of deepfakes: A survey. ACM Computing Surveys (CSUR), 54(1), 1–41.
Mirza, M., & Osindero, S. (2014). Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784
MIT Technology Review. (2020). Deepfake Putin is here to warn Americans about their self-inflicted doom. https://www.technologyreview.com/2020/09/29/1009098/ai-deepfake-putin-kim-jong-un-us-election/
Mittal, T., Bhattacharya, U., Chandra, R., Bera, A., & Manocha, D. (2020). Emotions don’t lie: A deepfake detection method using audio-visual affective cues. arXiv preprint arXiv:2003.06711
Miyato, T., Kataoka, T., Koyama, M., & Yoshida, Y. (2018). Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957
Mo, H., Chen, B., & Luo, W. (2018). Fake faces identification via convolutional neural network. In Proceedings of the 6th ACM workshop on information hiding and multimedia security (pp. 43–47).
Nataraj, L., Mohammed, T. M., Manjunath, B., Chandrasekaran, S., Flenner, A., Bappy, J. H., & Roy-Chowdhury, A. K. (2019). Detecting gan generated fake images using co-occurrence matrices. Electronic Imaging, 5, 532–1.
Natsume, R., Yatagawa, T., & Morishima, S. (2018). Rsgan: Face swapping and editing using face and hair representation in latent spaces. In ACM SIGGRAPH 2018 posters (pp. 1–2).
Neekhara, P., Dolhansky, B., Bitton, J., & Ferrer, C. C. (2021). Adversarial threats to deepfake detection: A practical perspective. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 923–932).
Neves, J. C., Tolosana, R., Vera-Rodriguez, R., Lopes, V., Proença, H., & Fierrez, J. (2020). Ganprintr: Improved fakes and evaluation of the state of the art in face manipulation detection. IEEE Journal of Selected Topics in Signal Processing, 14(5), 1038–1048.
Nguyen, H. H., Fang, F., Yamagishi, J., & Echizen, I. (2019a). Multi-task learning for detecting and segmenting manipulated facial images and videos. In Proceedings of the 10th international conference on biometrics theory, applications and systems (BTAS).
Nguyen, H. H., Yamagishi, J., & Echizen, I. (2019b). Capsule-forensics: Using capsule networks to detect forged images and videos. In ICASSP 2019—2019 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 2307–2311). IEEE.
Nguyen, H. H., Yamagishi, J., & Echizen, I. (2019c). Use of a capsule network to detect fake images and videos. arXiv preprint arXiv:1910.12467
Nguyen, T. T., Nguyen, C. M., Nguyen, D. T., Nguyen, D. T., & Nahavandi, S. (2019d). Deep learning for deepfakes creation and detection. arXiv preprint arXiv:1909.11573
Nhu, T., Na, I., & Kim, S. (2018). Forensics face detection from gans using convolutional neural network. In Proceedings of the 2018 international symposium on information technology convergence (ISITC 2018).
Nirkin, Y., Keller, Y., & Hassner, T. (2019). Fsgan: Subject agnostic face swapping and reenactment. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 7184–7193).
Nirkin, Y., Wolf, L., Keller, Y., & Hassner, T. (2020). Deepfake detection based on the discrepancy between the face and its context. arXiv preprint arXiv:2008.12262
Noroozi, M. (2020). Self-labeled conditional gans. arXiv preprint arXiv:2012.02162
NPR. (2020). Where are the deepfakes in this presidential election? https://www.npr.org/2020/10/01/918223033/where-are-the-deepfakes-in-this-presidential-election
OpenAI. (2021). DALL-E: Creating images from text. https://openai.com/blog/dall-e/
Osakabe, T., Tanaka, M., Kinoshita, Y., & Kiya, H. (2021). Cyclegan without checkerboard artifacts for counter-forensics of fake-image detection. In International workshop on advanced imaging technology (IWAIT) (Vol. 11766, p. 1176609). International Society for Optics and Photonics.
OValery. (2017). Swap-face. https://github.com/OValery16/swap-face
Pang, T., Du, C., Dong, Y., & Zhu, J. (2018). Towards robust detection of adversarial examples. In Proceedings of the 32nd international conference on neural information processing systems (pp. 4584–4594).
Park, T., Liu, M. Y., Wang, T. C., & Zhu, J. Y. (2019). Semantic image synthesis with spatially-adaptive normalization. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2337–2346).
Parkhi, O. M., Vedaldi, A., & Zisserman, A. (2015). Deep face recognition. In Proceedings of the British machine vision conference (BMVC) (pp 41.1–41.12).
Perarnau, G., Van De Weijer, J., Raducanu, B., & Álvarez, J. M. (2016). Invertible conditional gans for image editing. arXiv preprint arXiv:1611.06355
Petrov, I., Gao, D., Chervoniy, N., Liu, K., Marangonda, S., Umé, C., Jiang, J., Rp, L., Zhang, S., Wu, P., & Zhang, W. (2020). Deepfacelab: A simple, flexible and extensible face swapping framework. arXiv preprint arXiv:2005.05535
Pinscreen. (2021). AI avatars, virtual assistants, and deepfakes: A real-time look. Retrieved August 1, 2021, from https://blog.siggraph.org/2021/01/ai-avatars-virtual-assistants-and-deepfakes-a-real-time-look.html/, (online).
Pinscreen. (2021). Pinscreen AI-driven virtual avatars. http://www.pinscreen.com/
Pishori, A., Rollins, B., van Houten, N., Chatwani, N., & Uraimov, O. (2020). Detecting deepfake videos: An analysis of three techniques. arXiv preprint arXiv:2007.08517
Pu, J., Mangaokar, N., Wang, B., Reddy, C. K., & Viswanath, B. (2020). Noisescope: Detecting deepfake images in a blind setting. In Annual computer security applications conference (pp. 913–927).
Pu, J., Mangaokar, N., Kelly, L., Bhattacharya, P., Sundaram, K., Javed, M., Wang, B., & Viswanath, B. (2021). Deepfake videos in the wild: Analysis and detection. Proceedings of the Web Conference, 2021, 981–992.
Pumarola, A., Agudo, A., Martinez, A. M., Sanfeliu, A., & Moreno-Noguer, F. (2018). Ganimation: Anatomically-aware facial animation from a single image. In Proceedings of the European conference on computer vision (ECCV) (pp. 818–833).
Qi, H., Guo, Q., Juefei-Xu, F., Xie, X., Ma, L., Feng, W., Liu, Y., & Zhao, J. (2020). Deeprhythm: Exposing deepfakes with attentional visual heartbeat rhythms. In Proceedings of the 28th ACM international conference on multimedia (pp. 4318–4327).
Qian, Y., Yin, G., Sheng, L., Chen, Z., & Shao, J. (2020). Thinking in frequency: Face forgery detection by mining frequency-aware clues. In European conference on computer vision (pp. 86–103). Springer.
Radford, A., Metz, L., & Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., & Krueger, G. (2021). Learning transferable visual models from natural language supervision. In International conference on machine learning. PMLR.
Razavi, A., van den Oord, A., & Vinyals, O. (2019). Generating diverse high-fidelity images with vq-vae-2. In Advances in neural information processing systems (pp. 14866–14876).
Reface. (2021). Reface, Apple App Store. https://apps.apple.com/us/app/reface-face-swap-videos/id1488782587
Rezende, D., & Mohamed, S. (2015). Variational inference with normalizing flows. In International conference on machine learning (pp. 1530–1538). PMLR.
Rössler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., & Nießner, M. (2018). Faceforensics: A large-scale video dataset for forgery detection in human faces. arXiv preprint arXiv:1803.09179
Rossler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., & Nießner, M. (2019). Faceforensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE international conference on computer vision (pp. 1–11).
Sabir, E., Cheng, J., Jaiswal, A., AbdAlmageed, W., Masi, I., & Natarajan, P. (2019). Recurrent convolutional strategies for face manipulation detection in videos. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 80–87).
Sambhu, N., & Canavan, S. (2020). Detecting forged facial videos using convolutional neural network. arXiv preprint arXiv:2005.08344
Schwarcz, S., & Chellappa, R. (2021). Finding facial forgery artifacts with parts-based detectors. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 933–942).
Shu, Z., Yumer, E., Hadap, S., Sunkavalli, K., Shechtman, E., & Samaras, D. (2017). Neural face editing with intrinsic image disentangling. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5541–5550).
Sohrawardi, S. J., Chintha, A., Thai, B., Seng, S., Hickerson, A., Ptucha, R., & Wright, M. (2019). Poster: Towards robust open-world detection of deepfakes. In Proceedings of the 2019 ACM SIGSAC conference on computer and communications security (pp. 2613–2615).
Songsri-in, K., & Zafeiriou, S. (2019). Complement face forensic detection and localization with facial landmarks. arXiv preprint arXiv:1910.05455
Sun, K., Zhao, Y., Jiang, B., Cheng, T., Xiao, B., Liu, D., Mu, Y., Wang, X., Liu, W., & Wang, J. (2019). High-resolution representations for labeling pixels and regions. arXiv preprint arXiv:1904.04514
Sun, L., Juefei-Xu, F., Huang, Y., Guo, Q., Zhu, J., Feng, J., Liu, Y., & Pu, G. (2022). Ala: Adversarial lightness attack via naturalness-aware regularizations. arXiv preprint arXiv:2201.06070
Sun, Q., Tewari, A., Xu, W., Fritz, M., Theobalt, C., & Schiele, B. (2018). A hybrid model for identity obfuscation by face replacement. In Proceedings of the European conference on computer vision (ECCV) (pp. 553–569).
Sun, X., Wu, B., & Chen, W. (2020a). Identifying invariant texture violation for robust deepfake detection. arXiv preprint arXiv:2012.10580
Sun, Y., Wang, X., & Tang, X. (2013). Hybrid deep learning for face verification. In Proceedings of the IEEE international conference on computer vision (pp. 1489–1496).
Sun, Y., Wang, X., Liu, Z., Miller, J., Efros, A., Hardt, M. (2020b). Test-time training with self-supervision for generalization under distribution shifts. In International conference on machine learning (pp. 9229–9248). PMLR.
Suwajanakorn, S., Seitz, S. M., & Kemelmacher-Shlizerman, I. (2017). Synthesizing Obama: Learning lip sync from audio. ACM Transactions on Graphics (TOG), 36(4), 1–13.
Synthesia. (2021). Synthesia software. https://www.synthesia.io/
Tan, M., & Le, Q. (2019). Efficientnet: Rethinking model scaling for convolutional neural networks. In International conference on machine learning (pp. 6105–6114). PMLR.
Tarasiou, M., & Zafeiriou, S. (2020). Extracting deep local features to detect manipulated images of human faces. In 2020 IEEE international conference on image processing (ICIP) (pp. 1821–1825). IEEE.
Tariq, S., Lee, S., Kim, H., Shin, Y., & Woo, S. S. (2018). Detecting both machine and human created fake face images in the wild. In Proceedings of the 2nd international workshop on multimedia privacy and security (pp. 81–87).
Tariq, S., Lee, S., & Woo, S. S. (2020). A convolutional lstm based residual network for deepfake video detection. arXiv preprint arXiv:2009.07480
Texas. (2019). Texas Senate Bill No. 751. https://capitol.texas.gov/tlodocs/86R/analysis/html/SB00751F.htm
The Verge. (2019a). China makes it a criminal offense to publish deepfakes or fake news without disclosure. https://www.theverge.com/2019/11/29/20988363/
The Verge. (2019b). Virginia’s ‘revenge porn’ laws now officially cover deepfakes. https://www.theverge.com/2019/7/1/20677800/
Thies, J., Zollhofer, M., Stamminger, M., Theobalt, C., & Nießner, M. (2016). Face2face: Real-time face capture and reenactment of rgb videos. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2387–2395).
Thies, J., Zollhöfer, M., & Nießner, M. (2019). Deferred neural rendering: Image synthesis using neural textures. ACM Transactions on Graphics (TOG), 38(4), 1–12.
Tian, B., Guo, Q., Juefei-Xu, F., Le Chan, W., Cheng, Y., Li, X., Xie, X., & Qin, S. (2021a). Bias field poses a threat to dnn-based x-ray recognition. In 2021 IEEE international conference on multimedia and expo (ICME) (pp. 1–6). IEEE.
Tian, B., Juefei-Xu, F., Guo, Q., Xie, X., Li, X., & Liu, Y. (2021b). AVA: Adversarial vignetting attack against visual recognition. In Proceedings of the international joint conference on artificial intelligence (IJCAI).
Tolosana, R., Vera-Rodriguez, R., Fierrez, J., Morales, A., & Ortega-Garcia, J. (2020). Deepfakes and beyond: A survey of face manipulation and fake detection. Information Fusion, 64, 131–148.
Trinh, L., Tsang, M., Rambhatla, S., & Liu, Y. (2021). Interpretable and trustworthy deepfake detection via dynamic prototypes. In Proceedings of the IEEE/CVF winter conference on applications of computer vision (pp. 1973–1983).
Tripathy, S., Kannala, J., & Rahtu, E. (2020). Icface: Interpretable and controllable face reenactment using gans. In Proceedings of the IEEE/CVF winter conference on applications of computer vision (pp. 3385–3394).
Tripathy, S., Kannala, J., & Rahtu, E. (2021). Facegan: Facial attribute controllable reenactment gan. In Proceedings of the IEEE/CVF winter conference on applications of computer vision (pp. 1329–1338).
Tursman, E., George, M., Kamara, S., & Tompkin, J. (2020). Towards untrusted social video verification to combat deepfakes via face geometry consistency. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 654–655).
Twitter Blog. (2019). Help us shape our approach to synthetic and manipulated media. https://blog.twitter.com/en_us/topics/company/2019/synthetic_manipulated_media_policy_feedback.html
Ulyanov, D., Vedaldi, A., & Lempitsky, V. (2018). Deep image prior. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 9446–9454).
Van Oord, A., Kalchbrenner, N., & Kavukcuoglu, K. (2016). Pixel recurrent neural networks. In International conference on machine learning (pp. 1747–1756). PMLR.
Van den Oord, A., Kalchbrenner, N., Vinyals, O., Espeholt, L., Graves, A., & Kavukcuoglu, K. (2016). Conditional image generation with pixelcnn decoders. arXiv preprint arXiv:1606.05328
Verdoliva, L. (2020). Media forensics and deepfakes: An overview. IEEE Journal of Selected Topics in Signal Processing, 14(5), 910–932.
Viazovetskyi, Y., Ivashkin, V., & Kashin, E. (2020). Stylegan2 distillation for feed-forward image manipulation. In European conference on computer vision (pp. 170–186). Springer.
Wang, C., & Deng, W. (2021). Representative forgery mining for fake face detection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 14923–14932).
Wang, G., Zhou, J., & Wu, Y. (2020a). Exposing deep-faked videos by anomalous co-motion pattern detection. arXiv preprint arXiv:2008.04848
Wang, R., Juefei-Xu, F., Guo, Q., Huang, Y., Ma, L., Liu, Y., & Wang, L. (2020b). Deeptag: Robust image tagging for deepfake provenance. arXiv preprint arXiv:2009.09869
Wang, R., Juefei-Xu, F., Huang, Y., Guo, Q., Xie, X., Ma, L., & Liu, Y. (2020c). Deepsonar: Towards effective and robust detection of ai-synthesized fake voices. In Proceedings of the 28th ACM international conference on multimedia (pp. 1207–1216).
Wang, R., Juefei-Xu, F., Ma, L., Xie, X., Huang, Y., Wang, J., & Liu, Y. (2020d). Fakespotter: A simple yet robust baseline for spotting ai-synthesized fake faces. In International joint conference on artificial intelligence (IJCAI).
Wang, S. Y., Wang, O., Zhang, R., Owens, A., & Efros, A. A. (2020e). Cnn-generated images are surprisingly easy to spot... for now. In Proceedings of the IEEE conference on computer vision and pattern recognition (Vol. 7).
Wang, T. C., Mallya, A., & Liu, M. Y. (2021). One-shot free-view neural talking-head synthesis for video conferencing. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 10039–10049).
Wang, X., Yao, T., Ding, S., & Ma, L. (2020f). Face manipulation detection via auxiliary supervision. In International conference on neural information processing (pp. 313–324). Springer.
Wikipedia. (2021a). Elo rating system. Retrieved December 30, 2020, from https://en.wikipedia.org/wiki/Elo_rating_system (online).
Wikipedia. (2021b). Sankey diagram. Retrieved December 17, 2020, from https://en.wikipedia.org/wiki/Sankey_diagram (online)
Woods, W., Chen, J., & Teuscher, C. (2019). Adversarial explanations for understanding image classification decisions and improved neural network robustness. Nature Machine Intelligence, 1(11), 508–516.
Wu, X., Xie, Z., Gao, Y., & Xiao, Y. (2020). Sstnet: Detecting manipulated faces through spatial, steganalysis and temporal features. In ICASSP 2020—2020 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 2952–2956). IEEE.
Wu, Y., AbdAlmageed, W., & Natarajan, P. (2019). Mantra-net: Manipulation tracing network for detection and localization of image forgeries with anomalous features. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 9543–9552).
Xia, W., Yang, Y., Xue, J. H., & Wu, B. (2021). Tedigan: Text-guided diverse face image generation and manipulation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 2256–2265).
Xuan, X., Peng, B., Wang, W., & Dong, J. (2019). On the generalization of gan image forensics. In Chinese conference on biometric recognition (pp. 134–141). Springer.
Yang, X., Li, Y., & Lyu, S. (2019a). Exposing deep fakes using inconsistent head poses. In ICASSP 2019—2019 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 8261–8265). IEEE.
Yang, X., Li, Y., Qi, H., & Lyu, S. (2019b). Exposing gan-synthesized faces using landmark locations. In Proceedings of the ACM workshop on information hiding and multimedia security (pp. 113–118).
Yao, Y., Ren, J., Xie, X., Liu, W., Liu, Y. J., & Wang, J. (2019). Attention-aware multi-stroke style transfer. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1467–1475).
Yi, D., Lei, Z., Liao, S., & Li, S. Z. (2014). Learning face representation from scratch. arXiv preprint arXiv:1411.7923
Yu, C. M., Chang, C. T., & Ti, Y. W. (2019a). Detecting deepfake-forged contents with separable convolutional neural network and image segmentation. arXiv preprint arXiv:1912.12184
Yu, F., Seff, A., Zhang, Y., Song, S., Funkhouser, T., & Xiao, J. (2015). Lsun: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365
Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., & Huang, T. S. (2018). Generative image inpainting with contextual attention. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5505–5514).
Yu, N., Davis, L. S., & Fritz, M. (2019b). Attributing fake images to gans: Learning and analyzing gan fingerprints. In Proceedings of the IEEE international conference on computer vision (pp. 7556–7566).
Yu, N., Skripniuk, V., Chen, D., Davis, L., & Fritz, M. (2020a). Responsible disclosure of generative models using scalable fingerprinting. arXiv preprint arXiv:2012.08726
Yu, Y., Ni, R., & Zhao, Y. (2020b). Mining generalized features for detecting ai-manipulated fake faces. arXiv preprint arXiv:2010.14129
Yuan, L., Chen, D., Chen, Y.L., Codella, N., Dai, X., Gao, J., Hu, H., Huang, X., Li, B., Li, C., & Liu, C. (2021). Florence: A new foundation model for computer vision. arXiv preprint arXiv:2111.11432
Zao. (2021). Zao, Apple App Store. https://apps.apple.com/cn/app/zao/id1465199127
Zhai, L., Juefei-Xu, F., Guo, Q., Xie, X., Ma, L., Feng, W., Qin, S., & Liu, Y. (2020). It’s raining cats or dogs? adversarial rain attack on dnn perception. arXiv preprint arXiv:2009.09205
Zhai, L., Juefei-Xu, F., Guo, Q., Xie, X., Ma, L., Feng, W., Qin, S., & Liu, Y. (2022). Adversarial rain attack and defensive deraining for dnn perception. arXiv preprint arXiv:2009.09205
Zhang, C., Zhao, Y., Huang, Y., Zeng, M., Ni, S., Budagavi, M., & Guo, X. (2021a). Facial: Synthesizing dynamic talking face with implicit attribute learning. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 3867–3876).
Zhang, G., Kan, M., Shan, S., & Chen, X. (2018). Generative adversarial network with spatial attention for face attribute editing. In Proceedings of the European conference on computer vision (ECCV) (pp. 417–432).
Zhang, W., Ji, X., Chen, K., Ding, Y., & Fan, C. (2021b). Learning a facial expression embedding disentangled from identity. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 6759–6768).
Zhang, X., Karaman, S., & Chang, S. F. (2019). Detecting and simulating artifacts in gan fake images. In 2019 IEEE international workshop on information forensics and security (WIFS) (pp. 1–6). IEEE.
Zhang, Y., Zheng, L., & Thing, V. L. (2017). Automated face swapping and its detection. In 2017 IEEE 2nd international conference on signal and image processing (ICSIP) (pp. 15–19). IEEE.
Zhao, H., Zhou, W., Chen, D., Wei, T., Zhang, W., & Yu, N. (2021). Multi-attentional deepfake detection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 2185–2194).
Zhao, T., Xu, X., Xu, M., Ding, H., Xiong, Y., & Xia, W. (2020). Learning to recognize patch-wise consistency for deepfake detection. arXiv preprint arXiv:2012.09311
Zheng, Z., & Hong, P. (2018). Robust detection of adversarial attacks by modeling the intrinsic properties of deep neural networks. In Proceedings of the 32nd international conference on neural information processing systems (pp. 7924–7933).
Zhou, H., Sun, Y., Wu, W., Loy, C. C., Wang, X., & Liu, Z. (2021a). Pose-controllable talking face generation by implicitly modularized audio-visual representation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 4176–4186).
Zhou, P., Han, X., Morariu, V. I., & Davis, L. S. (2017). Two-stream neural networks for tampered face detection. In 2017 IEEE conference on computer vision and pattern recognition workshops (CVPRW) (pp. 1831–1839). IEEE.
Zhou, P., Han, X., Morariu, V. I., & Davis, L. S. (2018). Learning rich features for image manipulation detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1053–1061).
Zhou, T., Wang, W., Liang, Z., & Shen, J. (2021b). Face forensics in the wild. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5778–5788).
Zhu, H., Fu, C., Wu, Q., Wu, W., Qian, C., & He, R. (2020). Aot: Appearance optimal transport based identity swapping for forgery detection. Advances in Neural Information Processing Systems, 33, 21699–21712.
Zhu, J., Guo, Q., Juefei-Xu, F., Huang, Y., Liu, Y., & Pu, G. (2022). Masked faces with faced masks. arXiv preprint arXiv:2201.06427
Zhu, J. Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision (pp. 2223–2232).
Zhu, X., Wang, H., Fei, H., Lei, Z., & Li, S. Z. (2021a). Face forgery detection by 3d decomposition. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 2929–2939).
Zhu, Y., Li, Q., Wang, J., Xu, C. Z., & Sun, Z. (2021b). One shot face swapping on megapixels. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 4834–4844).
Zi, B., Chang, M., Chen, J., Ma, X., & Jiang, Y. G. (2020). Wilddeepfake: A challenging real-world dataset for deepfake detection. In Proceedings of the 28th ACM international conference on multimedia (pp. 2382–2390).
Acknowledgements
This research was partly supported by the National Key Research and Development Program of China under Grant No. 2021YFB3100700, the Fellowship of China National Postdoctoral Program for Innovative Talents under No. BX2021229, the Natural Science Foundation of Hubei Province under No. 2021CFB089, the Fundamental Research Funds for the Central Universities under No. 2042021kf1030, the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness under No. HNTS2022004, and the National Natural Science Foundation of China (NSFC) under No. 61876134. The work was also supported by the National Research Foundation, Singapore under its AI Singapore Programme (AISG2-RP-2020-019); the National Research Foundation, Prime Minister's Office, Singapore under its National Cybersecurity R&D Program (No. NRF2018NCR-NCR005-0001); NRF Investigatorship NRFI06-2020-0001; and the National Research Foundation through its National Satellite of Excellence in Trustworthy Software Systems (NSOE-TSS) project under the National Cybersecurity R&D (NCR) Grant (No. NRF2018NCR-NSOE003-0001). We gratefully acknowledge the support of the NVIDIA AI Tech Center (NVAITC) for our research.
Additional information
Communicated by Jian Sun.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Project lead: Felix Juefei-Xu.
Cite this article
Juefei-Xu, F., Wang, R., Huang, Y. et al. Countering Malicious DeepFakes: Survey, Battleground, and Horizon. Int J Comput Vis 130, 1678–1734 (2022). https://doi.org/10.1007/s11263-022-01606-8