ABSTRACT
Generating a deep learning-based fake video is no longer rocket science. Advances in automated Deepfake (DF) generation tools that mimic specific targets have left society vulnerable to fake news and the spread of misinformation. In real-world scenarios, DF videos are compressed into low-quality (LQ) videos, which take up less storage space and are easier to disseminate through the web and social media. Such LQ DF videos are much more challenging to detect than high-quality (HQ) DF videos. To address this challenge, we rethink the design of standard deep learning-based DF detectors, specifically exploiting feature extraction to enhance the features of LQ images. We propose a novel LQ DF detection architecture, the multi-scale Branch Zooming Network (BZNet), which adopts an unsupervised super-resolution (SR) technique and uses multi-scale images for training. We train BZNet using only highly compressed LQ images and experiment under a realistic setting in which HQ training data are not readily accessible. Extensive experiments on the FaceForensics++ LQ and GAN-generated datasets demonstrate that our BZNet architecture improves the detection accuracy of existing CNN-based classifiers by 4.21% on average. Furthermore, we evaluate our method on a real-world Deepfake-in-the-Wild dataset collected from the internet, which contains 200 videos featuring 50 celebrities worldwide, where it outperforms state-of-the-art methods by 4.13%.
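The abstract describes the multi-scale idea only at a high level, so the following is a rough illustrative sketch, not the BZNet implementation. It assumes that an LQ image is enlarged to several scales and that one score per scale is averaged. Nearest-neighbour upscaling stands in for the learned unsupervised SR module, and `mean_intensity_score` is a hypothetical placeholder for a CNN branch; the function names, the scale factors (1, 2, 4), and the averaging step are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

# Grayscale image as rows of pixel intensities (assumed representation).
Image = List[List[float]]


def upscale_nearest(img: Image, factor: int) -> Image:
    """Nearest-neighbour upscaling: a placeholder for a learned SR module."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    out: Image = []
    for row in img:
        # Repeat each pixel `factor` times horizontally...
        wide = [px for px in row for _ in range(factor)]
        # ...then repeat the widened row `factor` times vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


def multi_scale_inputs(img: Image, factors: Sequence[int] = (1, 2, 4)) -> List[Image]:
    """Build one enlarged view of the LQ image per (assumed) scale factor."""
    return [upscale_nearest(img, f) for f in factors]


def mean_intensity_score(img: Image) -> float:
    """Hypothetical stand-in for a branch classifier: mean pixel intensity."""
    total = sum(sum(row) for row in img)
    count = sum(len(row) for row in img)
    return total / count


def fuse_branch_scores(views: Sequence[Image], branch: Callable[[Image], float]) -> float:
    """Average the per-scale scores (an assumed fusion rule)."""
    scores = [branch(view) for view in views]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    lq_image = [[0.0, 1.0], [1.0, 0.0]]
    views = multi_scale_inputs(lq_image)
    print([(len(v), len(v[0])) for v in views])
    print(fuse_branch_scores(views, mean_intensity_score))
```

In a real detector each view would be fed to a trained network branch, and the upscaling would come from an SR model trained without paired HQ data; the sketch only shows how the multi-scale inputs and a simple score fusion fit together.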
Index Terms
- BZNet: Unsupervised Multi-scale Branch Zooming Network for Detecting Low-quality Deepfake Videos