Transition-Based Discourse Parsing with Multilayer Stack Long Short Term Memory

  • Conference paper
Natural Language Understanding and Intelligent Applications (ICCPOL 2016, NLPCC 2016)

Abstract

Discourse parsing aims to identify the relationships between different discourse units. Most previous work focuses on recovering the constituency structure among discourse units with carefully designed features. In this paper, we propose to exploit Long Short Term Memory (LSTM) networks to properly represent discourse units, with as little feature engineering as possible. Our transition-based parsing model features a multilayer stack LSTM framework to discover the dependency structures among different units. Experiments on the RST Discourse Treebank show that our model outperforms traditional feature-based systems in terms of dependency structures, without complicated feature design. When evaluated on discourse constituency, our parser also achieves promising performance compared with state-of-the-art constituency discourse parsers.
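To give a sense of the core data structure, below is a minimal, single-layer sketch of a stack LSTM in PyTorch, in the spirit of Dyer et al. [7]. This is not the authors' code: the class name, interface, and sizes are illustrative assumptions, and the paper's multilayer variant would compose several such units.

```python
# A minimal single-layer stack LSTM sketch (illustrative assumptions only).
import torch
import torch.nn as nn

class StackLSTM(nn.Module):
    """An LSTM whose state history is kept on a stack: push() advances the
    recurrence by one step, pop() rewinds it to the previous state."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        empty = (torch.zeros(1, hidden_size), torch.zeros(1, hidden_size))
        self.states = [empty]  # bottom of stack = state of the empty stack

    def push(self, x):
        # Run one LSTM step from the current top state and stack the result.
        h, c = self.states[-1]
        self.states.append(self.cell(x.view(1, -1), (h, c)))

    def pop(self):
        # Discard the top state, restoring the summary of the shorter stack.
        if len(self.states) > 1:
            self.states.pop()

    def top(self):
        # Hidden vector summarizing the entire current stack contents.
        return self.states[-1][0]

# Toy usage: a SHIFT-like action pushes an EDU embedding; a REDUCE-like
# action pops back to the previous state.
stack = StackLSTM(input_size=8, hidden_size=16)
stack.push(torch.randn(8))
stack.push(torch.randn(8))
stack.pop()
summary = stack.top()  # shape (1, 16)
```

A transition-based discourse parser would keep one such stack for partially built subtrees alongside a buffer of unprocessed EDUs, and score actions such as SHIFT and REDUCE from the concatenation of their top vectors.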


Notes

  1. In Table 2, rows ID 1 and ID 2 use the results of [23] obtained with their full feature set 1 and feature set 2, respectively. Their two feature sets are listed below; a short code sketch of the templates follows the list:

     (1) WORD: the first word, the last word, and the first bigram in each EDU, together with the pair of the first words and the pair of the last words of the two EDUs, are extracted as features.

     (2) POS: the first and second POS tags in each EDU, and the pair of the first POS tags of the two EDUs, are extracted as features.
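To make the templates concrete, here is a minimal sketch of how such WORD and POS features could be extracted for a pair of EDUs. It is an illustration only, not the implementation of [23]; the EDU representation (parallel token and POS-tag lists) and all function and feature names are assumptions.

```python
# Illustrative sketch of the WORD and POS feature templates of [23] quoted
# above. EDUs are assumed to be token lists with parallel POS-tag lists;
# function and feature names are hypothetical.

def word_features(edu1, edu2):
    """WORD template: first word, last word, and first bigram of each EDU,
    plus the pair of first words and the pair of last words across EDUs."""
    feats = []
    for name, edu in (("e1", edu1), ("e2", edu2)):
        feats.append(f"{name}:first={edu[0]}")
        feats.append(f"{name}:last={edu[-1]}")
        if len(edu) > 1:
            feats.append(f"{name}:bigram={edu[0]}_{edu[1]}")
    feats.append(f"pair:first={edu1[0]}|{edu2[0]}")
    feats.append(f"pair:last={edu1[-1]}|{edu2[-1]}")
    return feats

def pos_features(tags1, tags2):
    """POS template: first one and two POS tags of each EDU, plus the pair
    of first POS tags across EDUs."""
    feats = []
    for name, tags in (("e1", tags1), ("e2", tags2)):
        feats.append(f"{name}:pos1={tags[0]}")
        if len(tags) > 1:
            feats.append(f"{name}:pos12={tags[0]}_{tags[1]}")
    feats.append(f"pair:pos1={tags1[0]}|{tags2[0]}")
    return feats

# Example: two adjacent EDUs from a parsed sentence.
e1, t1 = ["because", "prices", "fell"], ["IN", "NNS", "VBD"]
e2, t2 = ["investors", "sold"], ["NNS", "VBD"]
print(word_features(e1, e2))
print(pos_features(t1, t2))
```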

References

  1. Abney, S.P., Johnson, M.: Memory requirements and local ambiguities of parsing strategies. J. Psycholinguist. Res. 20, 233–250 (1991)

  2. Ballesteros, M., Dyer, C., Smith, N.A.: Improved transition-based parsing by modeling characters instead of words with LSTMs. In: EMNLP 2015, Lisbon, Portugal, pp. 349–359 (2015)

  3. Carlson, L., Marcu, D., Okurowski, M.E.: Building a discourse-tagged corpus in the framework of rhetorical structure theory. In: SIGdial Workshop, pp. 1–10 (2001)

  4. Chorowski, J., Bahdanau, D., Serdyuk, D., Cho, K., Bengio, Y.: Attention-based models for speech recognition (2015). CoRR arXiv:1506.07503

  5. Chung, J., Gülçehre, Ç., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling (2014). CoRR arXiv:1412.3555

  6. Collins, M., Roark, B.: Incremental parsing with the perceptron algorithm. In: Proceedings of 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, 21–26 July 2004, pp. 111–118 (2004)

  7. Dyer, C., Ballesteros, M., Ling, W., Matthews, A., Smith, N.A.: Transition-based dependency parsing with stack long short-term memory. In: ACL 2015, Beijing, vol. 1, pp. 334–343 (2015)

  8. Eisner, J.: Three new probabilistic models for dependency parsing: an exploration. In: COLING 1996, 5–9 August 1996, pp. 340–345 (1996)

  9. Eyben, F., Böck, S., Schuller, B.W., Graves, A.: Universal onset detection with bidirectional long short-term memory neural networks. In: ISMIR 2010, Utrecht, Netherlands, 9–13 August 2010, pp. 589–594 (2010)

  10. Feng, V.W., Hirst, G.: A linear-time bottom-up discourse parser with constraints and post-editing. In: ACL 2014, Baltimore, MD, USA, vol. 1, pp. 511–521 (2014)

  11. Ferrucci, D.A., Brown, E.W., Chu-Carroll, J., Fan, J., Gondek, D., Kalyanpur, A., Lally, A., Murdock, J.W., Nyberg, E., Prager, J.M., Schlaefer, N., Welty, C.A.: Building Watson: an overview of the DeepQA project. AI Mag. 31(3), 59–79 (2010)

  12. Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: AISTATS 2011, Fort Lauderdale, USA, 11–13 April 2011, pp. 315–323 (2011)

  13. Graves, A., Jaitly, N., Mohamed, A.: Hybrid speech recognition with deep bidirectional LSTM. In: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, Olomouc, Czech Republic, 8–12 December 2013, pp. 273–278 (2013)

  14. Graves, A., Schmidhuber, J.: Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 18(5–6), 602–610 (2005)

  15. Hahn, U.: The theory and practice of discourse parsing and summarization by Daniel Marcu. Comput. Linguist. 28(1), 81–83 (2002)

  16. Hernault, H., Prendinger, H., duVerle, D.A., Ishizuka, M.: HILDA: a discourse parser using support vector machine classification. D&D 1(3), 1–33 (2010)

  17. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  18. Ji, Y., Eisenstein, J.: One vector is not enough: entity-augmented distributed semantics for discourse relations. TACL 3, 329–344 (2015)

  19. Joty, S.R., Carenini, G., Ng, R.T., Mehdad, Y.: Combining intra- and multi-sentential rhetorical parsing for document-level discourse analysis. In: ACL 2013, Sofia, Bulgaria, vol. 1, pp. 486–496 (2013)

  20. Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. In: ICML 2014, Beijing, China, 21–26 June 2014, pp. 1188–1196 (2014)

  21. LeThanh, H.: Generating discourse structures for written texts. In: Proceedings of 20th International Conference on Computational Linguistics, pp. 329–335 (2004)

  22. Li, J., Li, R., Hovy, E.H.: Recursive deep models for discourse parsing. In: EMNLP 2014, Doha, Qatar, 25–29 October 2014, pp. 2061–2069 (2014)

  23. Li, S., Wang, L., Cao, Z., Li, W.: Text-level discourse dependency parsing. In: ACL 2014, Baltimore (vol. 1: Long Papers), pp. 25–35 (2014)

  24. Louis, A., Joshi, A.K., Nenkova, A.: Discourse indicators for content selection in summarization. In: SIGDIAL 2010 Conference, Tokyo, Japan, pp. 147–156 (2010)

  25. Mann, W., Thompson, S.: Rhetorical structure theory: toward a functional theory of text organization. Text-Interdiscip. J. Study Discourse 8, 243–281 (1988)

  26. McDonald, R.T., Crammer, K., Pereira, F.C.N.: Online large-margin training of dependency parsers. In: ACL 2005, University of Michigan, USA (2005)

  27. Nivre, J., Scholz, M.: Deterministic dependency parsing of English text. In: COLING 2004, Geneva, Switzerland, 23–27 August 2004 (2004)

  28. Socher, R., Karpathy, A., Le, Q.V., Manning, C.D., Ng, A.Y.: Grounded compositional semantics for finding and describing images with sentences. TACL 2, 207–218 (2014)

  29. Soricut, R., Marcu, D.: Sentence level discourse parsing using syntactic and lexical information. In: HLT-NAACL (2003)

  30. Tai, K.S., Socher, R., Manning, C.D.: Improved semantic representations from tree-structured long short-term memory networks. In: ACL 2015, Beijing, pp. 1556–1566 (2015)

  31. Voll, K., Taboada, M.: Not all words are created equal: extracting semantic orientation as a function of adjective relevance. In: Orgun, M.A., Thornton, J. (eds.) AI 2007. LNCS (LNAI), vol. 4830, pp. 337–346. Springer, Heidelberg (2007). doi:10.1007/978-3-540-76928-6_35

Acknowledgments

We would like to thank Sujian Li, Liang Wang, and the anonymous reviewers for their helpful feedback. This work is supported by the National High Technology R&D Program of China (Grant Nos. 2015AA015403, 2014AA015102), the Natural Science Foundation of China (Grant Nos. 61202233, 61272344, 61370055), and a joint project with IBM Research. For any correspondence, please contact Yansong Feng.

Author information

Correspondence to Yansong Feng.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Jia, Y., Feng, Y., Luo, B., Ye, Y., Liu, T., Zhao, D. (2016). Transition-Based Discourse Parsing with Multilayer Stack Long Short Term Memory. In: Lin, C.-Y., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds) Natural Language Understanding and Intelligent Applications. ICCPOL/NLPCC 2016. Lecture Notes in Computer Science, vol. 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_30

  • DOI: https://doi.org/10.1007/978-3-319-50496-4_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50495-7

  • Online ISBN: 978-3-319-50496-4

  • eBook Packages: Computer Science, Computer Science (R0)
