Attention-Based Recurrent Neural Network for Sequence Labeling

Li, Bofang; Liu, Tao; Zhao, Zhe; Du, Xiaoyong

doi:10.1007/978-3-319-96890-2_28

Conference paper
First Online: 19 July 2018

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10987)

Abstract

Sequence labeling is one of the key problems in natural language processing. Recently, Recurrent Neural Networks (RNNs) and their variants have been widely used for this task. Despite their ability to encode long-distance information, in practice a single hidden layer is still not sufficient for prediction. In this paper, we propose an attention architecture for sequence labeling that allows RNNs to selectively focus on the useful hidden layers rather than irrelevant ones. We conduct experiments on four typical sequence labeling tasks: Part-Of-Speech Tagging (POS), Chunking, Named Entity Recognition (NER), and Slot Filling for Spoken Language Understanding (SF-SLU). The experiments show that our attention architecture provides consistent improvements over different RNN variants.
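The abstract does not give the equations, so the following is only a rough sketch of the general idea of attending over RNN hidden states when labeling one token. It assumes that the "hidden layers" are the per-position hidden states and that relevance is scored with a plain dot product; the function names (`softmax`, `dot`, `attend`) and the toy values are hypothetical, and the paper's actual scoring function and architecture may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(hidden_states, t):
    """Return (weights, context) for position t.

    Assumption: relevance is the dot product between the hidden state at
    position t and every hidden state in the sequence. This scoring choice
    is illustrative and is not taken from the paper.
    """
    query = hidden_states[t]
    weights = softmax([dot(query, h) for h in hidden_states])
    dim = len(query)
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


# Toy hidden states for a 3-token sentence (hypothetical values).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend(hidden_states, t=2)

# A tagger could feed the current state together with the attention
# context into the output layer that predicts the label of token t.
features = hidden_states[2] + context
```

In this sketch, hidden states that score higher against the current position receive larger weights, so the context vector is dominated by the states judged useful and less relevant ones contribute little.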

Notes

1. For simplicity, we omit bias terms in all the equations.
2. CoNLL 2000 shared task: http://www.cnts.ua.ac.be/conll2000/chunking.
3. CoNLL 2003 shared task: http://www.cnts.ua.ac.be/conll2003/ner.
4. http://code.google.com/p/word2vec/.

Acknowledgments

This work is supported by the Fundamental Research Funds for the Central Universities, the Research Funds of Renmin University of China, and the National Natural Science Foundation of China (grant No. 61472428).

Author information

Authors and Affiliations

School of Information, Renmin University of China, Beijing, China
Bofang Li, Tao Liu, Zhe Zhao & Xiaoyong Du
Key Laboratory of Data Engineering and Knowledge Engineering, MOE, Beijing, China
Bofang Li, Tao Liu, Zhe Zhao & Xiaoyong Du

Corresponding author

Correspondence to Tao Liu.

Editor information

Editors and Affiliations

South China University of Technology, Guangzhou, China
Yi Cai
Nagoya University, Nagoya, Japan
Yoshiharu Ishikawa
Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
Jianliang Xu

Cite this paper

Li, B., Liu, T., Zhao, Z., Du, X. (2018). Attention-Based Recurrent Neural Network for Sequence Labeling. In: Cai, Y., Ishikawa, Y., Xu, J. (eds) Web and Big Data. APWeb-WAIM 2018. Lecture Notes in Computer Science, vol 10987. Springer, Cham. https://doi.org/10.1007/978-3-319-96890-2_28

DOI: https://doi.org/10.1007/978-3-319-96890-2_28
Published: 19 July 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96889-6
Online ISBN: 978-3-319-96890-2
eBook Packages: Computer Science, Computer Science (R0)
