Abstract
Traditional approaches to Mongolian named entity recognition rely heavily on feature engineering. Worse, the complex morphological structure of Mongolian words aggravates data sparsity. To alleviate both problems, we propose a framework of recurrent neural networks with morpheme representations, and we study this framework in depth through several model variants. More specifically, the morpheme representation exploits a characteristic of the classical Mongolian script and can be learned from an unlabeled corpus. The model is further augmented with different character representations and auxiliary language model losses, which extract contextual knowledge from scratch. By decoding jointly with a Conditional Random Field (CRF) layer, the model learns the dependencies between labels. Experimental results show that feeding morpheme representations into the neural network outperforms word representations, and that the additional character representation and morpheme language model loss further improve performance.
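To illustrate the joint decoding the abstract refers to, the sketch below implements Viterbi decoding over label transition scores, the inference step of a CRF layer. The labels and all scores are illustrative assumptions, not values from the paper; a negative transition score stands in for a learned constraint (e.g. I-PER may not follow O).

```python
def viterbi_decode(emissions, transitions, labels):
    """Find the best label sequence given per-token emission scores
    and label-to-label transition scores (log-space, additive)."""
    n = len(emissions)
    # score[i][y] = best score of any path ending in label y at token i
    score = [dict(emissions[0])]
    back = []
    for i in range(1, n):
        score.append({})
        back.append({})
        for y in labels:
            best_prev = max(labels,
                            key=lambda p: score[i - 1][p] + transitions[(p, y)])
            back[-1][y] = best_prev
            score[i][y] = (score[i - 1][best_prev]
                           + transitions[(best_prev, y)] + emissions[i][y])
    # Backtrack from the best final label.
    last = max(labels, key=lambda y: score[-1][y])
    path = [last]
    for bp in reversed(back):
        path.append(bp[path[-1]])
    return list(reversed(path))


labels = ["O", "B-PER", "I-PER"]
# Illustrative scores: transitions strongly penalize O -> I-PER,
# encoding the label dependency a CRF layer would learn.
transitions = {(p, y): 0.0 for p in labels for y in labels}
transitions[("O", "I-PER")] = -10.0
transitions[("B-PER", "I-PER")] = 2.0
emissions = [
    {"O": 1.0, "B-PER": 0.2, "I-PER": 0.0},
    {"O": 0.1, "B-PER": 1.5, "I-PER": 0.3},
    {"O": 0.2, "B-PER": 0.1, "I-PER": 1.4},  # valid only after B-PER
]
best_path = viterbi_decode(emissions, transitions, labels)
print(best_path)  # → ['O', 'B-PER', 'I-PER']
```

In the full model, the emission scores would come from the bidirectional recurrent network over morpheme representations; the transition matrix is trained jointly with it.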
This work was supported by the National Natural Science Foundation of China (Nos. 61563040, 61773224); Natural Science Foundation of Inner Mongolia (No. 2016ZD06).
Wang, W., Bao, F. & Gao, G. Learning Morpheme Representation for Mongolian Named Entity Recognition. Neural Process Lett 50, 2647–2664 (2019). https://doi.org/10.1007/s11063-019-10044-6