ISCA Archive Interspeech 2016

Speaker Verification Using Short Utterances with DNN-Based Estimation of Subglottal Acoustic Features

Jinxi Guo, Gary Yeung, Deepak Muralidharan, Harish Arsikere, Amber Afshan, Abeer Alwan

Speaker verification in real-world applications must sometimes cope with enrollment and/or test data of limited duration. MFCC-based i-vector systems have defined the state of the art for speaker verification, but it is well known that they are less effective with short utterances. To address this issue, we propose a method that leverages the speaker specificity and stationarity of subglottal acoustics. First, we present a deep neural network (DNN) based approach to estimating subglottal features from speech signals. The approach involves training a DNN regression model that maps the log filter-bank coefficients of a given speech signal to those of its corresponding subglottal signal. Cross-validation experiments on the WashU-UCLA corpus (which contains parallel recordings of speech and subglottal acoustics) show the effectiveness of our DNN-based estimation algorithm: the average correlation coefficient between the actual and estimated subglottal filter-bank coefficients is 0.9. Second, a score-level fusion of MFCC and subglottal-feature systems in the i-vector PLDA framework yields statistically significant improvements over the MFCC-only baseline. On the NIST SRE 08 truncated 10 sec–10 sec and 5 sec–5 sec core evaluation tasks, the relative reduction in equal error rate ranges from 6% to 14% for the conditions tested, with both microphone and telephone speech.
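
As an illustration of the evaluation quantities named in the abstract, the following is a minimal Python sketch of score-level fusion, equal error rate (EER), and relative EER reduction. It is not the authors' implementation: the abstract does not specify the fusion rule, so the weighted sum, the `weight` parameter, and all function names here are assumptions, and the scores are made-up toy values.

```python
import numpy as np


def fuse_scores(mfcc_scores, subglottal_scores, weight=0.5):
    """Linear score-level fusion (assumed rule; the weight is hypothetical)."""
    mfcc_scores = np.asarray(mfcc_scores, dtype=float)
    subglottal_scores = np.asarray(subglottal_scores, dtype=float)
    return weight * mfcc_scores + (1.0 - weight) * subglottal_scores


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    # False-reject rate: fraction of target trials scoring below the threshold.
    frr = np.array([(target_scores < t).mean() for t in thresholds])
    # False-accept rate: fraction of non-target trials at or above the threshold.
    far = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    # Take the threshold where the two error rates are closest.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


def relative_reduction(baseline_eer, fused_eer):
    """Relative EER reduction, in percent, of the fused system over the baseline."""
    return 100.0 * (baseline_eer - fused_eer) / baseline_eer


if __name__ == "__main__":
    # Toy trial scores for illustration only (not data from the paper).
    mfcc_target = [1.0, 3.0]
    mfcc_nontarget = [0.0, 2.0]
    sub_target = [2.0, 3.0]
    sub_nontarget = [0.0, 1.0]

    baseline = equal_error_rate(mfcc_target, mfcc_nontarget)
    fused = equal_error_rate(
        fuse_scores(mfcc_target, sub_target),
        fuse_scores(mfcc_nontarget, sub_nontarget),
    )
    print(f"baseline EER = {baseline:.3f}, fused EER = {fused:.3f}")
```

In practice the two systems' scores would usually be normalized to a common scale before fusion, and the fusion weight would be tuned on held-out development trials; both steps are omitted from this sketch.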


doi: 10.21437/Interspeech.2016-282

Cite as: Guo, J., Yeung, G., Muralidharan, D., Arsikere, H., Afshan, A., Alwan, A. (2016) Speaker Verification Using Short Utterances with DNN-Based Estimation of Subglottal Acoustic Features. Proc. Interspeech 2016, 2219-2222, doi: 10.21437/Interspeech.2016-282

@inproceedings{guo16b_interspeech,
  author={Jinxi Guo and Gary Yeung and Deepak Muralidharan and Harish Arsikere and Amber Afshan and Abeer Alwan},
  title={{Speaker Verification Using Short Utterances with DNN-Based Estimation of Subglottal Acoustic Features}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={2219--2222},
  doi={10.21437/Interspeech.2016-282}
}