ISCA Archive SLTU 2018

Improved Language Identification Using Stacked SDC Features and Residual Neural Network

Ravi Kumar Vuddagiri, Hari Krishna Vydana, Anil Kumar Vuppala

Language identification (LID) systems that can model high-level information such as phonotactics have exhibited superior performance. State-of-the-art systems use sequential models to capture this high-level information, but these models are sensitive to the length of the utterance and do not generalize equally well over variable-length utterances. To capture this information effectively, a feature that can model the long-term temporal context is required. This study aims to capture the long-term temporal context by appending successive shifted delta cepstral (SDC) features. Deep neural networks have been explored for developing LID systems. Experiments have been performed using the AP17-OLR database. LID systems developed by stacking SDC features have shown significant improvement compared to the system trained with SDC features alone. The proposed feature, combined with residual connections in the feed-forward networks, reduced the equal error rate from 21.04, 18.02 and 16.45 to 14.42, 11.14 and 10.11 on the 1-second, 3-second and >3-second test utterances, respectively.
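The idea of appending successive SDC features can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `sdc` and `stack_successive`, the 7-1-3-7 style parameters (`d=1`, `p=3`, `k=7`), the number of stacked vectors (`num_stacked=3`) and the edge padding are all assumptions made for illustration, since the abstract does not state these settings.

```python
import numpy as np


def sdc(cepstra, d=1, p=3, k=7):
    """Shifted delta cepstra for a (frames, N) array.

    Block i at frame t is c(t + i*p + d) - c(t + i*p - d); the k blocks
    are concatenated. Edges are padded by repeating the first and last
    frame (an assumption, not taken from the paper).
    """
    num_frames = cepstra.shape[0]
    pad = d + (k - 1) * p
    padded = np.pad(cepstra, ((pad, pad), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        offset = pad + i * p  # position of frame t + i*p in the padded array
        ahead = padded[offset + d : offset + d + num_frames]
        behind = padded[offset - d : offset - d + num_frames]
        blocks.append(ahead - behind)
    return np.concatenate(blocks, axis=1)


def stack_successive(features, num_stacked=3):
    """Append the next (num_stacked - 1) feature vectors to each frame."""
    num_frames = features.shape[0]
    padded = np.pad(features, ((0, num_stacked - 1), (0, 0)), mode="edge")
    shifted = [padded[j : j + num_frames] for j in range(num_stacked)]
    return np.concatenate(shifted, axis=1)


# Hypothetical input: 50 frames of 7 cepstral coefficients.
rng = np.random.default_rng(0)
cepstra = rng.standard_normal((50, 7))

sdc_feats = sdc(cepstra)                 # shape (50, 49)
stacked = stack_successive(sdc_feats)    # shape (50, 147)
```

Under these assumptions, each stacked vector spans a longer stretch of the signal than a single SDC vector, which is the long-term temporal context the abstract refers to.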


doi: 10.21437/SLTU.2018-44

Cite as: Kumar Vuddagiri, R., Vydana, H.K., Kumar Vuppala, A. (2018) Improved Language Identification Using Stacked SDC Features and Residual Neural Network. Proc. 6th Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2018), 210-214, doi: 10.21437/SLTU.2018-44

@inproceedings{kumarvuddagiri18_sltu,
  author={Ravi {Kumar Vuddagiri} and Hari Krishna Vydana and Anil {Kumar Vuppala}},
  title={{Improved Language Identification Using Stacked SDC Features and Residual Neural Network}},
  year=2018,
  booktitle={Proc. 6th Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2018)},
  pages={210--214},
  doi={10.21437/SLTU.2018-44}
}