ABSTRACT
This paper proposes a dual-channel network for DeepFake detection. One channel uses stacked MaxViT blocks to process downsampled versions of the original images; the other uses stacked ResNet basic blocks to capture features from the discrete cosine transform (DCT) spectra of the images. The features extracted by the two channels are concatenated and passed through a linear layer, and the entire model is trained to expose DeepFakes. Experimental results demonstrate that the proposed method achieves satisfactory forensic performance, and cross-dataset evaluations indicate that it also generalizes well.
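The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the MaxViT and ResNet feature extractors are omitted, and the function names (`dct_matrix`, `dct2`, `log_dct_spectrum`, `downsample`, `fuse`), the log scaling of the spectrum, the average-pooling downsampler, and the two-logit output are all assumptions made for illustration. The sketch only shows how the two channel inputs could be prepared and how two feature vectors could be concatenated and fed to a linear layer.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # DC row scaling makes the matrix orthonormal
    return mat


def dct2(image: np.ndarray) -> np.ndarray:
    """2D DCT-II of a single-channel image (rows first, then columns)."""
    h, w = image.shape
    return dct_matrix(h) @ image @ dct_matrix(w).T


def log_dct_spectrum(image: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Log-magnitude DCT spectrum (assumed input of the frequency channel)."""
    return np.log(np.abs(dct2(image)) + eps)


def downsample(image: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool by `factor` (assumed input of the spatial channel)."""
    h, w = image.shape
    h_crop, w_crop = h - h % factor, w - w % factor
    blocks = image[:h_crop, :w_crop].reshape(
        h_crop // factor, factor, w_crop // factor, factor
    )
    return blocks.mean(axis=(1, 3))


def fuse(spatial_feat: np.ndarray, freq_feat: np.ndarray,
         weight: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Concatenate the two feature vectors and apply one linear layer."""
    joint = np.concatenate([spatial_feat, freq_feat])
    return weight @ joint + bias


# Toy usage with stand-in "features": the flattened channel inputs take the
# place of the MaxViT and ResNet outputs, which are not modelled here.
rng = np.random.default_rng(0)
face = rng.random((32, 32))                       # hypothetical grayscale crop
spatial_feat = downsample(face, 4).ravel()        # 8 * 8 = 64 values
freq_feat = downsample(log_dct_spectrum(face), 4).ravel()  # 64 values
weight = rng.standard_normal((2, spatial_feat.size + freq_feat.size))
bias = np.zeros(2)
logits = fuse(spatial_feat, freq_feat, weight, bias)  # real-vs-fake scores
```

In a trained model, `weight` and `bias` would be learned together with the parameters of both channels; here they are random placeholders.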
Index Terms
- Exposing Deepfakes using Dual-Channel Network with Multi-Axis Attention and Frequency Analysis