ISCA Archive Interspeech 2016

Future Context Attention for Unidirectional LSTM Based Acoustic Model

Jian Tang, Shiliang Zhang, Si Wei, Li-Rong Dai

Recently, feedforward sequential memory networks (FSMNs) have shown a strong ability to model past and future long-term dependencies in speech signals without using recurrent feedback, and have achieved better performance than BLSTM in acoustic modeling. However, the encoding coefficients in FSMN are context-independent, while context-dependent weights are commonly considered more suitable for acoustic modeling. In this paper, we propose a novel architecture called attention-based LSTM, which employs context-dependent scores or context-dependent weights to encode temporal future context information via an attention mechanism for unidirectional LSTM based acoustic models. Preliminary experimental results on the TIMIT corpus show that the proposed attention-based LSTM achieves a phone error rate (PER) of 20.8%, compared with 20.1% for BLSTM. We also present extensive experiments evaluating different context attention methods.
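The abstract does not spell out the exact attention formulation, so the following is only a minimal sketch of how context-dependent weighting of future frames could look, assuming additive (Bahdanau-style) scoring over a fixed window of N future input frames conditioned on the current LSTM hidden state. The module name, scoring function, and dimensions are illustrative assumptions, not the paper's exact method.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FutureContextAttention(nn.Module):
        """Sketch: attend over a fixed window of future frames and encode
        them into one context vector with context-dependent weights."""

        def __init__(self, hidden_dim, feat_dim, window):
            super().__init__()
            self.window = window
            # Additive scoring (an assumption): score_i = v^T tanh(W h_t + U x_{t+i})
            self.W = nn.Linear(hidden_dim, hidden_dim, bias=False)
            self.U = nn.Linear(feat_dim, hidden_dim, bias=False)
            self.v = nn.Linear(hidden_dim, 1, bias=False)

        def forward(self, h_t, future):
            # h_t:    (batch, hidden_dim)       current LSTM hidden state
            # future: (batch, window, feat_dim) next N frames (zero-padded at utterance end)
            scores = self.v(torch.tanh(self.W(h_t).unsqueeze(1) + self.U(future)))
            weights = F.softmax(scores, dim=1)       # context-dependent weights over the window
            context = (weights * future).sum(dim=1)  # (batch, feat_dim) encoded future context
            return context, weights.squeeze(-1)

    # Usage: the context vector would be fed, together with h_t, to the output layer.
    att = FutureContextAttention(hidden_dim=512, feat_dim=40, window=5)
    context, weights = att(torch.randn(8, 512), torch.randn(8, 5, 40))

Because the weights are recomputed from h_t at every time step, the encoding of future frames is context-dependent, in contrast to the fixed, context-independent coefficients of FSMN; this is the distinction the abstract draws.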


doi: 10.21437/Interspeech.2016-185

Cite as: Tang, J., Zhang, S., Wei, S., Dai, L.-R. (2016) Future Context Attention for Unidirectional LSTM Based Acoustic Model. Proc. Interspeech 2016, 3394-3398, doi: 10.21437/Interspeech.2016-185

@inproceedings{tang16d_interspeech,
  author={Jian Tang and Shiliang Zhang and Si Wei and Li-Rong Dai},
  title={{Future Context Attention for Unidirectional LSTM Based Acoustic Model}},
  year={2016},
  booktitle={Proc. Interspeech 2016},
  pages={3394--3398},
  doi={10.21437/Interspeech.2016-185}
}