We assume that only word pairs identified by humans are available in a low-resource target language. The word pairs are parameterized by a bottleneck feature (BNF) extractor that is trained on transcribed data in a high-resource language. The cross-lingual BNFs of the word pairs are then used to train another neural network that generates a new feature representation in the target language. Pairwise learning of frame-level and word-level feature representations is investigated. The proposed feature representations were evaluated in a word discrimination task on the Switchboard telephone speech corpus. The learned features yield a 27.5% relative improvement over the best previously reported result on this task.
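To illustrate what pairwise learning from word-pair information can look like, the following is a minimal sketch and not the training objective of the paper, which the abstract does not specify. It assumes a cosine-based contrastive loss over two embedding vectors; the function names `cosine_similarity` and `pair_loss` and the `margin` parameter are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_loss(u, v, same_word, margin=0.5):
    """Assumed contrastive loss for one pair of embeddings.

    Same-word pairs are pulled together (loss falls as similarity rises);
    different-word pairs are penalised only when their similarity
    exceeds the margin.
    """
    sim = cosine_similarity(u, v)
    if same_word:
        return 1.0 - sim
    return max(0.0, sim - margin)


# Two embeddings of the same word type: identical vectors give zero loss.
loss_same = pair_loss([1.0, 0.0], [1.0, 0.0], same_word=True)
# Two embeddings of different word types: orthogonal vectors give zero loss.
loss_diff = pair_loss([1.0, 0.0], [0.0, 1.0], same_word=False)
```

In a setup of this kind, `u` and `v` would be the outputs of the second network applied to the cross-lingual BNFs of a word pair, either per aligned frame or per whole word, and the loss would be summed over all pairs and minimised by gradient descent.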
Cite as: Yuan, Y., Leung, C.-C., Xie, L., Ma, B., Li, H. (2016) Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information. Proc. Interspeech 2016, 788-792, doi: 10.21437/Interspeech.2016-317
@inproceedings{yuan16_interspeech,
  author={Yougen Yuan and Cheung-Chi Leung and Lei Xie and Bin Ma and Haizhou Li},
  title={{Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={788--792},
  doi={10.21437/Interspeech.2016-317}
}