Abstract
Sequence tagging models can take many forms, each with its own strengths and limitations. In this contribution, we introduce a hybrid model for sequence tagging that combines recurrent neural networks with conditional random fields. It avoids feature engineering and addresses rare and out-of-vocabulary words by complementing typical word embeddings with compositional character-to-word representations. By sharing parameters across multiple tasks, we achieve performance that is superior or comparable to that of current state-of-the-art models.
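The idea of complementing word embeddings with character-to-word representations can be illustrated with a minimal sketch. It is not the model described above: the abstract does not specify the composition function, so a plain average of character vectors stands in for a learned recurrent composition, and all names, dimensions and the random vectors are hypothetical.

```python
import random

def make_table(keys, dim, seed):
    """Build a lookup table of small random vectors (stand-ins for learned embeddings)."""
    rng = random.Random(seed)
    return {k: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for k in keys}

def char_to_word(word, char_table, dim):
    """Compose a word vector from its characters.

    Assumption: a plain average replaces a recurrent composition. It ignores
    character order but still yields a vector for any word.
    """
    vecs = [char_table[c] for c in word if c in char_table]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def represent(word, word_table, char_table, word_dim, char_dim):
    """Concatenate the word embedding (zeros if out of vocabulary) with the character-based vector."""
    word_vec = word_table.get(word, [0.0] * word_dim)
    return word_vec + char_to_word(word, char_table, char_dim)

WORD_DIM, CHAR_DIM = 4, 3  # hypothetical sizes
word_table = make_table(["casa", "rio"], WORD_DIM, seed=0)
char_table = make_table("abcdefghijklmnopqrstuvwxyz", CHAR_DIM, seed=1)

known = represent("casa", word_table, char_table, WORD_DIM, CHAR_DIM)
unseen = represent("casarão", word_table, char_table, WORD_DIM, CHAR_DIM)
```

Here the unseen word has no entry in the word table, so its word-level part is all zeros, yet its character-level part still carries information. In a tagger, vectors like these would be the per-token input to the recurrent layer.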
Notes
- 1. The log probability learned from the CRF layer is backpropagated via cross-entropy.
- 2. For pre-trained word embeddings, we used the ones at https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md.
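The log probability mentioned in note 1 can be illustrated with a small sketch of a linear-chain CRF. It assumes the standard formulation (per-token emission scores plus tag-to-tag transition scores, normalised with the forward algorithm) and omits start and end transitions for brevity. The scores below are made-up numbers standing in for the recurrent network's outputs, not values from the paper.

```python
import math
from itertools import product

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def path_score(emissions, transitions, tags):
    """Unnormalised score of one tag sequence."""
    score = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return score

def log_partition(emissions, transitions):
    """Log of the summed exponentiated scores of all tag sequences (forward algorithm)."""
    n_tags = len(transitions)
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            logsumexp([alpha[i] + transitions[i][j] for i in range(n_tags)])
            + emissions[t][j]
            for j in range(n_tags)
        ]
    return logsumexp(alpha)

def sequence_log_prob(emissions, transitions, tags):
    """Log probability of a tag sequence under the CRF."""
    return path_score(emissions, transitions, tags) - log_partition(emissions, transitions)

# Hypothetical scores: 3 tokens, 2 tags.
emissions = [[1.0, 0.2], [0.3, 1.5], [0.8, 0.1]]
transitions = [[0.5, -0.2], [-0.1, 0.4]]
gold = [0, 1, 0]

# Training loss for this sentence: negative log probability of the gold tags.
nll = -sequence_log_prob(emissions, transitions, gold)

# Sanity check: probabilities of all 2**3 tag sequences sum to one.
total_prob = sum(
    math.exp(sequence_log_prob(emissions, transitions, list(p)))
    for p in product(range(2), repeat=3)
)
```

Minimising `nll` with respect to the emission and transition scores is what allows the gradient to flow from the CRF layer back into the network that produces the emissions.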
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
da Costa, P., Paetzold, G.H. (2018). Effective Sequence Labeling with Hybrid Neural-CRF Models. In: Villavicencio, A., et al. Computational Processing of the Portuguese Language. PROPOR 2018. Lecture Notes in Computer Science, vol 11122. Springer, Cham. https://doi.org/10.1007/978-3-319-99722-3_49
DOI: https://doi.org/10.1007/978-3-319-99722-3_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99721-6
Online ISBN: 978-3-319-99722-3
eBook Packages: Computer Science, Computer Science (R0)