ISCA Archive Interspeech 2021

Age Estimation with Speech-Age Model for Heterogeneous Speech Datasets

Ryu Takeda, Kazunori Komatani

This paper describes a method for estimating age from speech signals across heterogeneous datasets. Previous studies in the speech field evaluate age prediction models on held-out test data from the same dataset, recorded in a consistent setting, but such evaluation does not measure real-world performance. The difficulty with heterogeneous datasets is overfitting caused by corpus-specific properties: the transfer function of the recording environment and the distributions of age and speaker. We propose a speech-age model and its integration with sequence neural networks (NNs). The speech-age model represents the ambiguity of age as a probability distribution, which also virtually extends the limited age range covered by each corpus. A Bayesian generative model integrates the speech-age model with the NNs. We also apply a mean normalization technique to cope with the transfer function problem. Experiments showed that the proposed method outperformed a baseline neural classifier on test sets that were completely open in terms of both age distribution and recording setting.
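The abstract does not say how the mean normalization is implemented. As a minimal sketch of the general idea, assuming log-domain features (for example log-mel or cepstral coefficients) and per-utterance normalization, a fixed recording transfer function appears as a constant additive offset in each feature dimension, so subtracting the per-utterance mean cancels it. The function name `mean_normalize`, the feature shapes, and the synthetic data below are all hypothetical illustrations, not the authors' implementation.

```python
import random


def mean_normalize(frames):
    """Subtract the per-utterance mean of each feature dimension.

    frames: list of frames, each a list of log-domain feature values
    (an assumed input format, not one taken from the paper).
    """
    n_frames = len(frames)
    n_dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n_frames for d in range(n_dims)]
    return [[f[d] - means[d] for d in range(n_dims)] for f in frames]


random.seed(0)

# Hypothetical "clean" utterance: 50 frames of 4-dimensional log-domain features.
clean = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]

# Hypothetical channel: a time-invariant transfer function becomes a
# constant additive offset per dimension in the log domain.
channel = [random.gauss(0.0, 1.0) for _ in range(4)]
recorded = [[x + c for x, c in zip(frame, channel)] for frame in clean]

norm_clean = mean_normalize(clean)
norm_recorded = mean_normalize(recorded)

# Largest difference between the two normalized utterances; it should be
# at floating-point rounding level because the offset cancels.
max_diff = max(
    abs(a - b)
    for fa, fb in zip(norm_clean, norm_recorded)
    for a, b in zip(fa, fb)
)
```

In this sketch the normalized features of the clean and the channel-distorted utterance coincide up to rounding error, which is why this kind of normalization can reduce corpus-specific recording effects. It only removes a stationary, per-utterance offset; time-varying channels or additive noise are outside what this example covers.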


doi: 10.21437/Interspeech.2021-861

Cite as: Takeda, R., Komatani, K. (2021) Age Estimation with Speech-Age Model for Heterogeneous Speech Datasets. Proc. Interspeech 2021, 4164-4168, doi: 10.21437/Interspeech.2021-861

@inproceedings{takeda21_interspeech,
  author={Ryu Takeda and Kazunori Komatani},
  title={{Age Estimation with Speech-Age Model for Heterogeneous Speech Datasets}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={4164--4168},
  doi={10.21437/Interspeech.2021-861}
}