Abstract
Conversational Machine Comprehension (CMC) is a challenging task with a broad range of applications in natural language processing. Early approaches treat CMC as traditional single-turn machine reading comprehension (MRC). Recent studies have proposed multi-turn models that introduce an information flow mechanism to capture the temporal dependencies among follow-up questions along a conversation. However, previous methods merely consider shallow semantic dependencies at the "token-to-token" level and short-term temporal dependencies, ignoring the global transition information during the understanding and reasoning process. In this paper, we propose a Hierarchical Conversation Flow Transition and Reasoning (HCFTR) model for conversational machine comprehension. A multi-flow transition mechanism is designed to integrate globally-aware information flow transitions and perform dynamic reasoning. In addition, a multi-level flow-context attention mechanism is developed to fuse hierarchical fine-grained representations at multiple levels and perform advanced reasoning. Experimental results on two benchmark datasets show that our model outperforms strong baseline methods.
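To make the two mechanisms concrete, the sketch below illustrates the general idea in PyTorch: per-token reasoning states are propagated across turns (the information flow), gated with a global summary of all turns (the globally-aware transition), and then attended against the question states (flow-context attention). This is a minimal sketch under our own assumptions, not the authors' exact formulation: the module names, the GRU over the turn axis, the mean-pooled global summary, and the dot-product attention are illustrative simplifications.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiFlowTransition(nn.Module):
        """Sketch of a globally-aware flow transition: a GRU propagates
        per-token reasoning states across turns, and a gate mixes each
        turn's local flow state with a summary of the whole conversation."""

        def __init__(self, hidden_size):
            super().__init__()
            self.turn_rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
            self.global_gate = nn.Linear(2 * hidden_size, hidden_size)

        def forward(self, ctx):
            # ctx: [turns, passage_tokens, hidden] contextualized passage states
            flow, _ = self.turn_rnn(ctx.transpose(0, 1))  # GRU along the turn axis
            flow = flow.transpose(0, 1)                   # back to [turns, tokens, hidden]
            summary = flow.mean(dim=0, keepdim=True).expand_as(flow)  # all-turn summary
            gate = torch.sigmoid(self.global_gate(torch.cat([flow, summary], dim=-1)))
            return gate * flow + (1 - gate) * ctx         # globally-aware transition

    def flow_context_attention(question, flow):
        # question: [turns, q_tokens, hidden]; flow: [turns, p_tokens, hidden]
        scores = torch.einsum('tqh,tph->tqp', question, flow)  # dot-product scores
        weights = F.softmax(scores, dim=-1)                    # attend over the passage
        return torch.einsum('tqp,tph->tqh', weights, flow)     # flow-aware question states

In the full model, this kind of attention would be applied at several representation levels (e.g., word, contextual, and flow levels) and the outputs fused, which is what the "multi-level" qualifier refers to.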




Notes
The BERT embedding can be omitted from the embedding layer. We have performed experiments to show the performance with and without BERT.
This is a multi-hop advanced reasoning process, although we use 2-hop re-reasoning (see the sketch after these notes). Exploring general K-hop advanced reasoning is not our main focus.
The indicator \(I_i^S\) is set to 0 for Yes/No questions and unanswerable questions.
For a fair comparison, we re-implement the FlowDelta model by omitting the fine-tuning process and merely using pre-trained BERT to initialize the word embeddings.
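As a hedged illustration of the 2-hop re-reasoning loop mentioned above (the update rule is our assumption, reusing the flow_context_attention sketch from the abstract section, not the paper's equations):

    def k_hop_rereasoning(question, flow, hops=2):
        # Each hop re-attends the question states over the flow and fuses the
        # attended context back in; the model described here fixes hops = 2.
        for _ in range(hops):
            question = question + flow_context_attention(question, flow)
        return question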
References
Rajpurkar P, Zhang J, Lopyrev K, Liang P (2016) SQuAD: 100,000+ questions for machine comprehension of text. In: Proceedings of the 2016 conference on empirical methods in natural language processing, pp. 2383–2392
Reddy S, Chen D, Manning CD (2019) CoQA: a conversational question answering challenge. Trans Assoc Comput Linguist 7:249–266
Choi E, He H, Iyyer M, Yatskar M, Yih W-t, Choi Y, Liang P, Zettlemoyer L (2018) QuAC: question answering in context. In: Proceedings of the 2018 conference on empirical methods in natural language processing, pp. 2174–2184
Zhu C, Zeng M, Huang X (2018) SDNet: contextualized attention-based deep network for conversational question answering. arXiv preprint arXiv:1812.03593
Seo M, Kembhavi A, Farhadi A, Hajishirzi H (2017) Bidirectional attention flow for machine comprehension. arXiv preprint arXiv:1611.01603
Huang H, Choi E, Yih W (2019) FlowQA: grasping flow in history for conversational machine comprehension. In: 7th international conference on learning representations, ICLR 2019, New Orleans, LA, USA, May 6–9, 2019, pp. 1–11
Chen Y, Wu L, Zaki MJ (2020) GraphFlow: exploiting conversation flow with graph neural networks for conversational machine comprehension. https://openreview.net/forum?id=rkgi6JSYvB
Ohsugi Y, Saito I, Nishida K, Asano H, Tomita J (2019) A simple but effective method to incorporate multi-turn context with BERT for conversational machine comprehension. arXiv preprint arXiv:1905.12848
Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, Volume 1 (Long and Short Papers), pp. 4171–4186
Yeh Y-T, Chen Y-N (2019) FlowDelta: modeling flow information gain in reasoning for conversational machine comprehension. In: Proceedings of the 2nd workshop on machine reading for question answering, pp. 86–90. Association for Computational Linguistics, Hong Kong, China
Ju Y, Zhao F, Chen S, Zheng B, Yang X, Liu Y (2019) Technical report on conversational question answering. arXiv preprint arXiv:1909.10772
Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov RR, Le QV (2019) XLNet: generalized autoregressive pretraining for language understanding. Adv Neural Inf Process Syst 32
Pennington J, Socher R, Manning C (2014) GloVe: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing, pp. 1532–1543
McCann B, Bradbury J, Xiong C, Socher R (2017) Learned in translation: contextualized word vectors. In: Advances in neural information processing systems, pp. 6294–6305
Peters M, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, Volume 1 (Long Papers), pp. 2227–2237
Chen D, Fisch A, Weston J, Bordes A (2017) Reading wikipedia to answer open-domain questions. In: Proceedings of the 55th annual meeting of the association for computational linguistics (Volume 1: Long Papers), pp. 1870–1879
Graves A, Schmidhuber J (2005) Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw 18(5–6):602–610
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Huang H-Y, Zhu C, Shen Y, Chen W (2018) FusionNet: fusing via fully-aware attention with application to machine comprehension. In: International conference on learning representations, pp. 1–20. https://openreview.net/forum?id=BJIgi_eCZ
See A, Liu PJ, Manning CD (2017) Get to the point: summarization with pointer-generator networks. In: Proceedings of the 55th annual meeting of the association for computational linguistics (Volume 1: Long Papers), pp. 1073–1083
Yatskar M (2019) A qualitative comparison of CoQA, SQuAD 2.0 and QuAC. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, Volume 1 (Long and Short Papers), pp. 2318–2323
Qu C, Yang L, Qiu M, Croft WB, Zhang Y, Iyyer M (2019) BERT with history answer embedding for conversational question answering. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval, SIGIR '19, pp. 1133–1136. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3331184.3331341
Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692
Kim G, Kim H, Park J, Kang J (2021) Learn to resolve conversational dependency: a consistency training framework for conversational question answering. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing (Volume 1: Long Papers)
Gal Y, Ghahramani Z (2016) A theoretically grounded application of dropout in recurrent neural networks. In: Advances in neural information processing systems, pp. 1019–1027
Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China (No. 61906185, No. 61876053, No. 61902385), the Natural Science Foundation of Guangdong (No. 2019A1515011705), the Youth Innovation Promotion Association of CAS China (No. 2020357), the Shenzhen Science and Technology Innovation Program (Grant No. KQTD20190929172835662), and the Shenzhen Basic Research Foundation (No. JCYJ20200109113441941 and No. JCYJ20210324115614039). Ziyu Lyu is supported by the National Natural Science Foundation of China (No. 62002352).
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest and that they have no financial or proprietary interests in any material discussed in this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Xiao Liu and Min Yang have contributed equally to this work.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, X., Yang, M., Lyu, Z. et al. Hierarchical conversation flow transition and reasoning for conversational machine comprehension. Neural Comput & Applic 35, 2413–2428 (2023). https://doi.org/10.1007/s00521-022-07720-5