A survey of discourse parsing

  • Review Article
  • Frontiers of Computer Science

Abstract

Discourse parsing is an important research area in natural language processing (NLP) that aims to parse the discourse structure of coherent sentences. In this survey, we introduce several kinds of discourse parsing tasks, chiefly RST-style discourse parsing, PDTB-style discourse parsing, and discourse parsing for multiparty dialogue. For each task, we review classical and recent methods, with an emphasis on neural network approaches. We then describe applications of discourse parsing to other NLP tasks, such as machine reading comprehension and sentiment analysis. Finally, we discuss future trends for the task.

Acknowledgements

The research in this article is supported by the Science and Technology Innovation 2030 “New Generation Artificial Intelligence” Major Project (2018AA0101901), the National Key Research and Development Project (2018YFB1005103), the National Natural Science Foundation of China (Grant Nos. 61772156 and 61976073), Shenzhen Foundational Research Funding (JCYJ20200109113441941), and the Foundation of Heilongjiang Province (F2018013).

Author information

Corresponding author

Correspondence to Bing Qin.

Additional information

Jiaqi Li received the BS degree from the School of Computer Science and Technology, Heilongjiang University, China in 2015. He is currently working toward the PhD degree at the Harbin Institute of Technology, China. His research interests include discourse parsing for multiparty dialogues and its applications.

Ming Liu received the PhD degree from the School of Computer Science and Technology, Harbin Institute of Technology, China in 2010. He is a full professor and PhD supervisor in the Department of Computer Science, and a faculty member of the Research Center for Social Computing and Information Retrieval (HIT-SCIR), Harbin Institute of Technology, China. His research interests include knowledge graphs and machine reading comprehension.

Bing Qin received the PhD degree from the School of Computer Science and Technology, Harbin Institute of Technology, China in 2005. She is a full professor in the Department of Computer Science and the director of the Research Center for Social Computing and Information Retrieval (HIT-SCIR), Harbin Institute of Technology, China. Her research interests include natural language processing, information extraction, document-level discourse analysis, and sentiment analysis.

Ting Liu received the PhD degree from the Department of Computer Science, Harbin Institute of Technology, China in 1998. He is a full professor in the School of Computer Science and Technology and the director of the Faculty of Computing, Harbin Institute of Technology, China. His research interests include information retrieval, natural language processing, and social media analysis.

Cite this article

Li, J., Liu, M., Qin, B. et al. A survey of discourse parsing. Front. Comput. Sci. 16, 165329 (2022). https://doi.org/10.1007/s11704-021-0500-z
