Abstract
Conversational Machine Comprehension (CMC) is a challenging task with a broad range of applications in natural language processing. Early approaches treat CMC as traditional single-turn machine reading comprehension (MRC). Recent studies have proposed multi-turn models that introduce an information flow mechanism to capture the temporal dependencies among follow-up questions along a conversation. However, previous methods merely consider shallow semantic dependencies at the "token-to-token" level and short-term temporal dependencies, ignoring the global transition information during the understanding and reasoning process. In this paper, we propose a Hierarchical Conversation Flow Transition and Reasoning (HCFTR) model for conversational machine comprehension. A multi-flow transition mechanism is designed to integrate globally-aware information flow transitions and perform dynamic reasoning. In addition, a multi-level flow-context attention mechanism is developed to fuse hierarchical fine-grained representations at multiple levels and perform advanced reasoning. Experimental results on two benchmark datasets show that our model outperforms strong baseline methods.
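To make the two mechanisms concrete, the sketch below illustrates the general idea in PyTorch: per-token reasoning states are propagated across turns (the information flow), gated with a global summary of all turns (the globally-aware transition), and then attended against the question states (flow-context attention). This is a minimal sketch under our own assumptions, not the authors' exact formulation: the module names, the GRU over the turn axis, the mean-pooled global summary, and the dot-product attention are illustrative simplifications.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiFlowTransition(nn.Module):
        """Sketch of a globally-aware flow transition: a GRU propagates
        per-token reasoning states across turns, and a gate mixes each
        turn's local flow state with a summary of the whole conversation."""

        def __init__(self, hidden_size):
            super().__init__()
            self.turn_rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
            self.global_gate = nn.Linear(2 * hidden_size, hidden_size)

        def forward(self, ctx):
            # ctx: [turns, passage_tokens, hidden] contextualized passage states
            flow, _ = self.turn_rnn(ctx.transpose(0, 1))  # GRU along the turn axis
            flow = flow.transpose(0, 1)                   # back to [turns, tokens, hidden]
            summary = flow.mean(dim=0, keepdim=True).expand_as(flow)  # all-turn summary
            gate = torch.sigmoid(self.global_gate(torch.cat([flow, summary], dim=-1)))
            return gate * flow + (1 - gate) * ctx         # globally-aware transition

    def flow_context_attention(question, flow):
        # question: [turns, q_tokens, hidden]; flow: [turns, p_tokens, hidden]
        scores = torch.einsum('tqh,tph->tqp', question, flow)  # dot-product scores
        weights = F.softmax(scores, dim=-1)                    # attend over the passage
        return torch.einsum('tqp,tph->tqh', weights, flow)     # flow-aware question states

In the full model, this kind of attention would be applied at several representation levels (e.g., word, contextual, and flow levels) and the outputs fused, which is what the "multi-level" qualifier refers to.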




Notes
The BERT embedding can be omitted from the embedding layer. We have performed experiments to show the performance with and without BERT.
This is a multi-hop advanced reasoning process, although we use 2-hop re-reasoning (see the sketch after these notes). Exploring general K-hop advanced reasoning is not our main focus.
The indicator \(I_i^S\) is set to 0 for Yes/No questions and unanswerable questions.
For a fair comparison, we re-implement the FlowDelta model by omitting the fine-tuning process and merely using pre-trained BERT to initialize the word embeddings.
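As a hedged illustration of the 2-hop re-reasoning loop mentioned above (the update rule is our assumption, reusing the flow_context_attention sketch from the abstract section, not the paper's equations):

    def k_hop_rereasoning(question, flow, hops=2):
        # Each hop re-attends the question states over the flow and fuses the
        # attended context back in; the model described here fixes hops = 2.
        for _ in range(hops):
            question = question + flow_context_attention(question, flow)
        return question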
References
Rajpurkar P, Zhang J, Lopyrev K, Liang P (2016) SQuAD: 100,000+ questions for machine comprehension of text. In: Proceedings of the 2016 conference on empirical methods in natural language processing, pp. 2383–2392
Reddy S, Chen D, Manning CD (2019) CoQA: a conversational question answering challenge. Trans Assoc Comput Linguist 7:249–266
Choi E, He H, Iyyer M, Yatskar M, Yih W-t, Choi Y, Liang P, Zettlemoyer L (2018) QuAC: question answering in context. In: Proceedings of the 2018 conference on empirical methods in natural language processing, pp. 2174–2184
Zhu C, Zeng M, Huang X (2018) SDNet: contextualized attention-based deep network for conversational question answering. arXiv preprint arXiv:1812.03593
Seo M, Kembhavi A, Farhadi A, Hajishirzi H (2017) Bidirectional attention flow for machine comprehension. arXiv preprint arXiv:1611.01603
Huang H, Choi E, Yih W (2019) FlowQA: grasping flow in history for conversational machine comprehension. In: 7th international conference on learning representations, ICLR 2019, New Orleans, LA, USA, May 6–9, 2019, pp. 1–11
Chen Y, Wu L, Zaki MJ (2020) GraphFlow: exploiting conversation flow with graph neural networks for conversational machine comprehension. https://openreview.net/forum?id=rkgi6JSYvB
Ohsugi Y, Saito I, Nishida K, Asano H, Tomita J (2019) A simple but effective method to incorporate multi-turn context with BERT for conversational machine comprehension. arXiv preprint arXiv:1905.12848
Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, Volume 1 (Long and Short Papers), pp. 4171–4186
Yeh Y-T, Chen Y-N (2019) FlowDelta: modeling flow information gain in reasoning for conversational machine comprehension. In: Proceedings of the 2nd workshop on machine reading for question answering, pp. 86–90. Association for Computational Linguistics, Hong Kong, China
Ju Y, Zhao F, Chen S, Zheng B, Yang X, Liu Y (2019) Technical report on conversational question answering. arXiv preprint arXiv:1909.10772
Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov RR, Le QV (2019) XLNet: generalized autoregressive pretraining for language understanding. Adv Neural Inf Process Syst 32
Pennington J, Socher R, Manning C (2014) GloVe: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing, pp. 1532–1543
McCann B, Bradbury J, Xiong C, Socher R (2017) Learned in translation: contextualized word vectors. In: Advances in neural information processing systems, pp. 6294–6305
Peters M, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, Volume 1 (Long Papers), pp. 2227–2237
Chen D, Fisch A, Weston J, Bordes A (2017) Reading wikipedia to answer open-domain questions. In: Proceedings of the 55th annual meeting of the association for computational linguistics (Volume 1: Long Papers), pp. 1870–1879
Graves A, Schmidhuber J (2005) Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw 18(5–6):602–610
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Huang H-Y, Zhu C, Shen Y, Chen W (2018) FusionNet: fusing via fully-aware attention with application to machine comprehension. In: International conference on learning representations, pp. 1–20. https://openreview.net/forum?id=BJIgi_eCZ
See A, Liu PJ, Manning CD (2017) Get to the point: summarization with pointer-generator networks. In: Proceedings of the 55th annual meeting of the association for computational linguistics (Volume 1: Long Papers), pp. 1073–1083
Yatskar M (2019) A qualitative comparison of CoQA, SQuAD 2.0 and QuAC. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, Volume 1 (Long and Short Papers), pp. 2318–2323
Qu C, Yang L, Qiu M, Croft WB, Zhang Y, Iyyer M (2019) BERT with history answer embedding for conversational question answering. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval, SIGIR '19, pp. 1133–1136. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3331184.3331341
Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692
Kim G, Kim H, Park J, Kang J (2021) Learn to resolve conversational dependency: a consistency training framework for conversational question answering. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing (Volume 1: Long Papers)
Gal Y, Ghahramani Z (2016) A theoretically grounded application of dropout in recurrent neural networks. In: Advances in neural information processing systems, pp. 1019–1027
Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China (No. 61906185, No. 61876053, No. 61902385), the Natural Science Foundation of Guangdong (No. 2019A1515011705), the Youth Innovation Promotion Association of CAS China (No. 2020357), the Shenzhen Science and Technology Innovation Program (Grant No. KQTD20190929172835662), and the Shenzhen Basic Research Foundation (No. JCYJ20200109113441941 and No. JCYJ20210324115614039). Ziyu Lyu is supported by the National Natural Science Foundation of China (No. 62002352).
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest and that they have no financial or proprietary interests in any material discussed in this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Xiao Liu and Min Yang have contributed equally to this work.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, X., Yang, M., Lyu, Z. et al. Hierarchical conversation flow transition and reasoning for conversational machine comprehension. Neural Comput & Applic 35, 2413–2428 (2023). https://doi.org/10.1007/s00521-022-07720-5