Discriminative explicit instance selection for implicit discourse relation classification

  • Research Article
Frontiers of Computer Science

Abstract

Discourse relation classification is a fundamental task in discourse analysis and is essential for understanding the structure and connections of texts. Implicit discourse relation classification aims to determine the relationship between adjacent sentences. It is very challenging because there are no explicit discourse connectives to serve as linguistic cues and annotated training data are insufficient. In this paper, we propose a discriminative instance selection method that constructs synthetic implicit discourse relation data from easy-to-collect explicit discourse relations. An expanded instance consists of an argument pair and its sense label. We introduce an argument pair type classification task, which distinguishes implicit from explicit argument pairs, and use it to select the explicit argument pairs that are most similar to natural implicit argument pairs for data expansion. We also propose a simple label-smoothing technique to assign robust sense labels to the selected argument pairs. We evaluate our method on PDTB 2.0 and PDTB 3.0. The results show that our method consistently improves the performance of the baseline model and achieves results competitive with state-of-the-art models.
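
The two steps named in the abstract, selecting explicit argument pairs that look implicit and assigning smoothed sense labels, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `select_implicit_like` and `smooth_label`, the top-k selection rule, the `epsilon` value, and the use of generic uniform label smoothing are not taken from the paper, whose exact selection criterion and smoothing formula are not given in this abstract. The scores stand in for the output of a hypothetical argument pair type classifier, where a higher score means the explicit pair looks more like a natural implicit pair.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_implicit_like(pairs: Sequence[T], scores: Sequence[float], top_k: int) -> List[T]:
    """Keep the top_k explicit pairs with the highest 'implicit-likeness' score.

    Assumption: scores[i] is the probability, from a hypothetical argument
    pair type classifier, that pairs[i] is an implicit argument pair.
    """
    if len(pairs) != len(scores):
        raise ValueError("pairs and scores must have the same length")
    # Sort indices by descending score; ties keep their original order.
    order = sorted(range(len(pairs)), key=lambda i: (-scores[i], i))
    return [pairs[i] for i in order[:top_k]]


def smooth_label(label_index: int, num_senses: int, epsilon: float = 0.1) -> List[float]:
    """Generic uniform label smoothing (an illustrative stand-in, not the paper's formula).

    The original sense keeps most of the probability mass, and epsilon is
    spread uniformly over all senses, so the result still sums to 1.
    """
    if not 0 <= label_index < num_senses:
        raise ValueError("label_index out of range")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    dist = [epsilon / num_senses] * num_senses
    dist[label_index] += 1.0 - epsilon
    return dist


if __name__ == "__main__":
    # Hypothetical explicit argument pairs (connective removed) and scores.
    explicit_pairs = [
        ("It rained.", "The match was cancelled."),
        ("She studied hard.", "She failed the exam."),
        ("He left early.", "He missed the announcement."),
    ]
    implicit_likeness = [0.2, 0.9, 0.5]
    selected = select_implicit_like(explicit_pairs, implicit_likeness, top_k=2)
    for pair in selected:
        print(pair, smooth_label(label_index=1, num_senses=4))
```

In this sketch the smoothed distribution would serve as a soft training target for the selected pair, which reflects the intuition that a sense label inherited from an explicit connective may not hold once the connective is removed.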

Acknowledgements

This work was funded by the National Natural Science Foundation of China (Grant Nos. 62376166, 62306188, 61876113), and the National Key R&D Program of China (No. 2022YFC3303504).

Author information

Corresponding author

Correspondence to Wei Song.

Ethics declarations

Competing interests: The authors declare that they have no competing interests or financial conflicts to disclose.

Additional information

Wei Song is a professor at the Information Engineering College, Capital Normal University, China. He obtained his PhD degree from the Department of Computer Science, Harbin Institute of Technology, China in 2013. His research interests include natural language processing and its applications.

Hongfei Han is a master’s student at the Information Engineering College, Capital Normal University, China. He obtained his bachelor’s degree in Digital Media Technology from China Three Gorges University, China in 2019. His research interest is discourse analysis.

Xu Han is a lecturer at the Information Engineering College, Capital Normal University, China. She obtained her PhD degree from the Institute of Computing Technology, Chinese Academy of Sciences, China in 2011. Her research interests include natural language processing and sentiment analysis.

Miaomiao Cheng is a lecturer at the Information Engineering College, Capital Normal University, China. She obtained her PhD degree from the School of Computer and Information Technology, Beijing Jiaotong University, China in 2019. Her main research interest is multi-modal machine learning.

Jiefu Gong is a senior researcher at iFLYTEK AI Research Institute, China. He obtained his master’s degree from the School of Software, Harbin Institute of Technology, China in 2016. His main research interests are natural language processing and intelligent scoring systems.

Shijin Wang is the executive director of iFLYTEK AI Research Institute, China. He obtained his PhD degree from the Institute of Automation, Chinese Academy of Sciences, China in 2008. His main research interest is educational AI technology.

Ting Liu is a full professor at Harbin Institute of Technology, China. He obtained his PhD degree from the Department of Computer Science, Harbin Institute of Technology, China in 1998. His research interests include natural language processing, information retrieval, and social media analysis.

Cite this article

Song, W., Han, H., Han, X. et al. Discriminative explicit instance selection for implicit discourse relation classification. Front. Comput. Sci. 18, 184340 (2024). https://doi.org/10.1007/s11704-023-3058-2

  • DOI: https://doi.org/10.1007/s11704-023-3058-2
