Recurrent neural networks (RNNs) are conventionally trained in the supervised mode but used in the free-running mode for inference on test samples. The supervised mode feeds ground-truth token values to the RNN as inputs, whereas the free-running mode can only feed back self-predicted token values as surrogate inputs. This train/test inconsistency inevitably degrades the RNN's generalization on out-of-sample data. We propose a moment matching (MM) training strategy that alleviates the inconsistency by simultaneously taking both modes and their corresponding dynamics into consideration. Our MM-RNN shows significant performance improvements over existing approaches on practical NLP applications including logic form generation and image captioning.
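To make the train/test inconsistency concrete, the sketch below contrasts the two decoding modes and a first-moment matching penalty between their hidden-state statistics. Everything here is illustrative: the one-unit `rnn_step` cell, the use of the previous hidden state as the self-predicted surrogate input, and matching only the mean activation are simplifying assumptions, not the paper's exact formulation.

```python
import math

def rnn_step(h, x, w_h=0.5, w_x=0.9):
    # Hypothetical one-unit RNN cell with a tanh recurrence.
    return math.tanh(w_h * h + w_x * x)

def run_teacher_forced(tokens):
    # Supervised mode: feed the ground-truth token at every step.
    h, states = 0.0, []
    for x in tokens:
        h = rnn_step(h, x)
        states.append(h)
    return states

def run_free_running(tokens):
    # Free-running mode: after the first step, feed the model's own
    # output (here, simply the previous hidden state) back in as a
    # surrogate input -- the source of the train/test mismatch.
    h, states = 0.0, []
    x = tokens[0]
    for _ in tokens:
        h = rnn_step(h, x)
        states.append(h)
        x = h  # self-predicted surrogate input
    return states

def mm_loss(states_a, states_b):
    # First-moment matching: squared distance between the mean hidden
    # activations produced under the two modes.
    mean_a = sum(states_a) / len(states_a)
    mean_b = sum(states_b) / len(states_b)
    return (mean_a - mean_b) ** 2
```

Minimizing `mm_loss` alongside the usual supervised objective would pull the free-running dynamics toward the teacher-forced ones; the actual MM-RNN objective and the choice of matched statistics are specified in the paper.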
Cite as: Deng, Y., Shen, Y., Chen, K., Jin, H. (2018) Training Recurrent Neural Network through Moment Matching for NLP Applications. Proc. Interspeech 2018, 3353-3357, doi: 10.21437/Interspeech.2018-1369
@inproceedings{deng18_interspeech,
  author={Yue Deng and Yilin Shen and KaWai Chen and Hongxia Jin},
  title={{Training Recurrent Neural Network through Moment Matching for NLP Applications}},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3353--3357},
  doi={10.21437/Interspeech.2018-1369}
}