DOI: 10.1145/3607947.3608013
Research article

MMDFD: A Multimodal Custom Dataset for Deepfake Detection

Published: 28 September 2023

Abstract

A multimodal deepfake dataset is relevant to addressing the growing misuse of deepfakes, which poses a significant threat to security and privacy. Deepfakes are becoming increasingly sophisticated, and their potential to deceive individuals and organizations is a serious concern. The ability to synthesize human voices with deep learning models and to insert fake subtitles compounds the problem, making accurate detection even more challenging. A high-quality dataset is essential for developing a competent deepfake detector, yet existing datasets are limited and often biased. The proposed multimodal Audio-Video-Text Deepfake dataset (MMDFD) addresses this gap by providing a more realistic and unbiased resource, supporting detection methods that can identify audio, video, and textual deepfakes simultaneously. Because it is built from actual YouTube recordings of celebrities from four different racial origins, the dataset better reflects real-world conditions and helps avoid deepfake detectors that are biased toward particular racial or ethnic groups. Overall, such a multimodal dataset is essential to addressing the growing concerns of deepfake misuse and to developing detection methods that identify deepfakes accurately, regardless of the medium.
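The abstract describes clips in which each of the three modalities (video, audio, subtitles) can independently be real or manipulated. As a minimal sketch of how such a sample might be organized for a detector, assuming illustrative field names rather than the dataset's actual schema:

```python
from dataclasses import dataclass

# Hypothetical layout of one MMDFD-style sample: each modality carries
# its own real/fake label, since the paper targets audio, video, and
# textual (subtitle) deepfakes simultaneously. All names and paths here
# are assumptions for illustration, not the dataset's real structure.
@dataclass
class MultimodalSample:
    video_path: str        # path to the video clip
    audio_path: str        # path to the extracted audio track
    subtitle_text: str     # subtitle string attached to the clip
    video_is_fake: bool    # manipulated face / lip-sync?
    audio_is_fake: bool    # synthesized voice?
    text_is_fake: bool     # fabricated subtitles?

def overall_label(sample: MultimodalSample) -> str:
    """A clip counts as a deepfake if any single modality is manipulated."""
    if sample.video_is_fake or sample.audio_is_fake or sample.text_is_fake:
        return "fake"
    return "real"

# Example: a cloned voice dubbed over an otherwise genuine video.
sample = MultimodalSample(
    video_path="clips/celeb_001.mp4",
    audio_path="clips/celeb_001.wav",
    subtitle_text="I never said that.",
    video_is_fake=False,
    audio_is_fake=True,
    text_is_fake=False,
)
print(overall_label(sample))  # prints "fake"
```

Per-modality labels like these are what let a benchmark evaluate unimodal detectors (video-only, audio-only) against multimodal ones on the same clips.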


Cited By

  • (2024) D-Fence layer: an ensemble framework for comprehensive deepfake detection. Multimedia Tools and Applications 83, 26 (2024), 68063–68086. DOI: 10.1007/s11042-024-18130-1. Online publication date: 29 January 2024.

Published In

IC3-2023: Proceedings of the 2023 Fifteenth International Conference on Contemporary Computing
August 2023
783 pages
ISBN:9798400700224
DOI:10.1145/3607947
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Deep neural networks
  2. Deepfake Dataset
  3. Multimodal
  4. Subtitle Deepfakes

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

IC3 2023

