It is our great pleasure to welcome you to the 1st International Workshop on Deepfake Detection for Audio Multimedia - DDAM 2022. Audio deepfake detection is an emerging topic in multimedia and was included in ASVspoof 2021. This workshop aims to bring together researchers working on audio deepfake detection, audio deep synthesis, audio fake game, and adversarial attacks to discuss recent research and future directions for detecting deepfake and manipulated audio in multimedia. As far as we know, this is the first workshop to focus on deepfake detection for audio multimedia, which makes it of great significance.
Proceeding Downloads
Lessons Learned from ASVSpoof and Remaining Challenges
Although speech technology reproducing an individual's voice is expected to bring new value to entertainment, it may cause security problems in speaker recognition systems if misused. In addition, there is a possibility of this technology being used for ...
Detection of Synthetic Speech Based on Spectrum Defects
Synthetic spoofing speech has become a threat to online communication and automatic speaker verification (ASV) systems based on deep learning since the synthetic model can produce anyone's voice. The first Audio Deep Synthesis Detection Challenge (ADD ...
Low-quality Fake Audio Detection through Frequency Feature Masking
The first Audio Deep Synthesis Detection Challenge (ADD 2022) was held, covering audio deepfake detection, audio deep synthesis, audio fake game, and adversarial attacks. Our team participated in track 1, classifying bona fide and fake ...
Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features
- Jun Xue
- Cunhang Fan
- Zhao Lv
- Jianhua Tao
- Jiangyan Yi
- Chengshi Zheng
- Zhengqi Wen
- Minmin Yuan
- Shegang Shao
Recently, pioneering research has proposed a large number of acoustic features (log power spectrogram, linear frequency cepstral coefficients, constant Q cepstral coefficients, etc.) for audio deepfake detection, obtaining good performance, and ...
Fully Automated End-to-End Fake Audio Detection
- Chenglong Wang
- Jiangyan Yi
- Jianhua Tao
- Haiyang Sun
- Xun Chen
- Zhengkun Tian
- Haoxin Ma
- Cunhang Fan
- Ruibo Fu
The existing fake audio detection systems often rely on expert experience to design the acoustic features or manually design the hyperparameters of the network structure. However, artificial adjustment of the parameters can have a relatively obvious ...
A Comparative Study on Physical and Perceptual Features for Deepfake Audio Detection
Audio content synthesis has stepped into a new era and brought a great threat to daily life since the development of deep learning techniques. The ASVSpoof Challenge and the ADD Challenge have been launched to motivate the development of Deepfake audio ...
Deepfake Detection System for the ADD Challenge Track 3.2 Based on Score Fusion
This paper describes the deepfake audio detection system submitted to the Audio Deep Synthesis Detection (ADD) Challenge Track 3.2 and gives an analysis of score fusion. The proposed system is a score-level fusion of several light convolutional neural ...
Singing-Tacotron: Global Duration Control Attention and Dynamic Filter for End-to-end Singing Voice Synthesis
End-to-end singing voice synthesis (SVS) is attractive due to the avoidance of pre-aligned data. However, the auto-learned alignment of singing voice with lyrics is difficult to match the duration information in a musical score, which will lead to the ...
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio
Many effective attempts have been made for fake audio detection. However, they can only provide detection results but no countermeasures to curb this harm. For many related practical applications, what model or algorithm generated the ...
Deep Spectro-temporal Artifacts for Detecting Synthesized Speech
- Xiaohui Liu
- Meng Liu
- Lin Zhang
- Linjuan Zhang
- Chang Zeng
- Kai Li
- Nan Li
- Kong Aik Lee
- Longbiao Wang
- Jianwu Dang
The Audio Deep Synthesis Detection (ADD) Challenge has been held to detect generated human-like speech. With our submitted system, this paper provides an overall assessment of track 1 (Low-quality Fake Audio Detection) and track 2 (Partially Fake Audio ...
Acoustic or Pattern? Speech Spoofing Countermeasure based on Image Pre-training Models
Traditional speech spoofing countermeasures (CM) typically contain a frontend that extracts a two-dimensional feature from the waveform, and a Convolutional Neural Network (CNN)-based backend classifier. This pipeline is similar to an image ...
Human Perception of Audio Deepfakes
The recent emergence of deepfakes has brought manipulated and generated content to the forefront of machine learning research. Automatic detection of deepfakes has seen many new machine learning techniques. Human detection capabilities, however, are far ...
Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion
Audio deep synthesis techniques have been able to generate high-quality speech whose authenticity is difficult for humans to recognize. Meanwhile, many anti-spoofing systems have been developed to capture artifacts in the synthesized speech that are ...
Acceptance Rates
Year | Submitted | Accepted | Rate |
---|---|---|---|
DDAM '22 | 14 | 12 | 86% |
Overall | 14 | 12 | 86% |