Abstract:
Intelligent speech recognition is increasingly used in embedded systems, where it is seriously threatened by malicious speech spoofing attacks. Unlike conventional methods, this article proposes a segment-based anti-spoofing detection (SASD) method for quickly detecting spoofed speech aimed at embedded speech recognition. The method focuses on anti-spoofing features rather than on the context of the speech or the voiceprint of the speaker. Each speech sample is divided into word segments and silent segments. Based on constant Q cepstral coefficients (CQCCs), a word CQCC (WCQCC) extraction is first designed for the word segments. Then, based on the short-term zero-crossing rate (ZCR), an average ZCR (AZCR) extraction is devised for the silent segments. Combining the WCQCC of the word segments with the AZCR of the silent segments, a biased decision strategy is proposed to quickly determine whether a speech sample is spoofed. Extensive experiments on the ASVspoof 2021 datasets evaluate the effectiveness of the proposed method. Specifically, SASD improves the accuracy of anti-spoofing detection by up to 33.47% and saves up to 69.10% of the time overhead on embedded devices compared with existing methods.
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (Volume: 41, Issue: 11, November 2022)
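
The abstract names an average zero-crossing rate (AZCR) over silent segments but does not give its formula. As a rough illustration only, the sketch below uses the textbook short-term ZCR (the fraction of adjacent samples whose signs differ) and averages it over fixed-length frames. The function names, the frame length of 400 samples, the hop of 160 samples, and the choice to treat zero as non-negative are all assumptions made for this example, not details taken from the article.

```python
from typing import List, Sequence


def short_term_zcr(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs in one frame whose signs differ.

    Assumption: a sample equal to zero is treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def average_zcr(segment: Sequence[float],
                frame_len: int = 400,
                hop: int = 160) -> float:
    """Mean short-term ZCR over overlapping frames of one segment.

    frame_len and hop are hypothetical defaults (25 ms and 10 ms at 16 kHz).
    A segment shorter than frame_len is handled as a single short frame.
    """
    last_start = max(len(segment) - frame_len, 0)
    rates: List[float] = []
    for start in range(0, last_start + 1, hop):
        frame = segment[start:start + frame_len]
        if len(frame) >= 2:
            rates.append(short_term_zcr(frame))
    return sum(rates) / len(rates) if rates else 0.0


if __name__ == "__main__":
    # Synthetic inputs, used only to exercise the two functions.
    alternating = [1.0, -1.0] * 400   # sign flips at every sample
    constant = [0.5] * 800            # no sign changes at all
    print(average_zcr(alternating), average_zcr(constant))
```

How such a value would be combined with the WCQCC features in the biased decision strategy is not specified in the abstract, so the sketch stops at the feature itself.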