The 2020 Personalized Voice Trigger Challenge: Open Datasets, Evaluation Metrics, Baseline System and Results

Jia, Yan; Wang, Xingming; Qin, Xiaoyi; Zhang, Yinping; Wang, Xuyang; Wang, Junjie; Zhang, Dong; Li, Ming

doi:10.21437/Interspeech.2021-602

The 2020 Personalized Voice Trigger Challenge (PVTC2020) addresses two different research problems in a unified setup: joint wake-up word detection with speaker verification on close-talking single-microphone data and on far-field multi-channel microphone array data. Specifically, the second task poses an additional cross-channel matching challenge on top of the far-field condition. To simulate a real-life application scenario, the enrollment utterances are recorded from a close-talking cell phone only, while the test utterances are recorded from both the close-talking cell phone and the far-field microphone arrays. This paper introduces our challenge setup, the released database, and the evaluation metrics. In addition, we present a sequential two-stage end-to-end neural network baseline system, trained on the proposed database, for speaker-dependent wake-up word detection. Results show that state-of-the-art personalized voice trigger methods are still based on the two-stage design; however, this benchmark database could also be used to evaluate multi-task joint learning methods. The official website, the open-source baseline system, and the results of submitted systems have been released.

Cite as: Jia, Y., Wang, X., Qin, X., Zhang, Y., Wang, X., Wang, J., Zhang, D., Li, M. (2021) The 2020 Personalized Voice Trigger Challenge: Open Datasets, Evaluation Metrics, Baseline System and Results. Proc. Interspeech 2021, 4239-4243, doi: 10.21437/Interspeech.2021-602

@inproceedings{jia21b_interspeech,
  author={Yan Jia and Xingming Wang and Xiaoyi Qin and Yinping Zhang and Xuyang Wang and Junjie Wang and Dong Zhang and Ming Li},
  title={{The 2020 Personalized Voice Trigger Challenge: Open Datasets, Evaluation Metrics, Baseline System and Results}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={4239--4243},
  doi={10.21437/Interspeech.2021-602}
}