
SC-NER: A Sequence-to-Sequence Model with Sentence Classification for Named Entity Recognition

  • Conference paper

Part of: Advances in Knowledge Discovery and Data Mining (PAKDD 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11439)

Abstract

Named Entity Recognition (NER) is a basic task in Natural Language Processing (NLP). Recently, the sequence-to-sequence (seq2seq) model has been widely used in NLP tasks. Unlike general NLP tasks, about 60% of the sentences in the NER task contain no entities, and traditional seq2seq methods cannot address this issue effectively. To solve this problem, we propose a novel seq2seq model, named SC-NER, for the NER task. We construct a classifier between the encoder and the decoder; in particular, the classifier's input is the last hidden state of the encoder. Moreover, we present a restricted beam search to improve the performance of the proposed SC-NER. To evaluate the proposed model, we construct a corpus of patent documents in the communications field and conduct experiments on it. Experimental results show that our SC-NER model achieves better performance than other baseline methods.
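The abstract does not spell out what the "restricted" beam search restricts, but a common choice for NER decoding is to prune hypotheses that violate the BIO tagging scheme (an I- tag may only follow a B- or I- tag of the same type). The sketch below illustrates that idea under this assumption, with a toy tag set and hand-made emission scores standing in for the decoder's outputs; it is not the authors' implementation.

```python
import math

# Toy tag set and hand-made per-token log-probabilities; in SC-NER these
# scores would come from the seq2seq decoder.
TAGS = ["O", "B-ENT", "I-ENT"]

def valid_transition(prev_tag, tag):
    """BIO constraint: I-ENT may only follow B-ENT or I-ENT."""
    if tag == "I-ENT":
        return prev_tag in ("B-ENT", "I-ENT")
    return True

def restricted_beam_search(scores, beam_size=2):
    """Beam search over tag sequences that drops (restricts) any
    hypothesis violating the BIO transition constraint."""
    beams = [([], 0.0)]  # (tag sequence so far, cumulative log-prob)
    for step_scores in scores:
        candidates = []
        for seq, logp in beams:
            prev = seq[-1] if seq else "O"
            for tag, s in zip(TAGS, step_scores):
                if not valid_transition(prev, tag):
                    continue  # the restriction: prune invalid continuations
                candidates.append((seq + [tag], logp + s))
        # Keep only the top `beam_size` hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0][0]

# Example: 3 tokens; token 1 slightly prefers I-ENT, which the constraint
# forbids directly after O, so the search routes around it.
scores = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],  # token 0: prefers O
    [math.log(0.3), math.log(0.3), math.log(0.4)],  # token 1: prefers I-ENT
    [math.log(0.6), math.log(0.2), math.log(0.2)],  # token 2: prefers O
]
print(restricted_beam_search(scores))
```

Without the constraint, a greedy decoder would emit the invalid sequence O, I-ENT, O for these scores; the restricted search is guaranteed to return a BIO-consistent tagging.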


Notes

  1. https://patents.google.com.


Acknowledgment

This work was partially supported by Natural Science Foundation of China (No. 61603197, 61772284, 61876091, 61802205), Jiangsu Provincial Natural Science Foundation of China under Grant BK20171447, Jiangsu Provincial University Natural Science Research of China under Grant 17KJB520024, the Natural Science Research Project of Jiangsu Province under Grant 18KJB520037, and Nanjing University of Posts and Telecommunications under Grant NY215045.

Author information

Corresponding author: Yun Li.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, Y., Li, Y., Zhu, Z., Xia, B., Liu, Z. (2019). SC-NER: A Sequence-to-Sequence Model with Sentence Classification for Named Entity Recognition. In: Yang, Q., Zhou, Z.H., Gong, Z., Zhang, M.L., Huang, S.J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2019. Lecture Notes in Computer Science, vol 11439. Springer, Cham. https://doi.org/10.1007/978-3-030-16148-4_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-16148-4_16


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-16147-7

  • Online ISBN: 978-3-030-16148-4

  • eBook Packages: Computer Science, Computer Science (R0)
