
Retrospective Multi-granularity Fusion Network for Chinese Idiom Cloze-style Reading Comprehension

Published: 20 July 2023

Abstract

The Chinese idiom cloze-style reading comprehension task is of great significance for improving a machine's ability to understand Chinese idioms, an essential capability for advanced artificial intelligence applications. Existing methods suffer from an insufficiently deep semantic understanding of the text. To address this problem, this paper proposes a novel Retrospective Multi-granularity Fusion Network (RMFNet) for Chinese idiom cloze-style reading comprehension. Our RMFNet is equipped with two novel modules that model deeper contextual information of the passage and the candidate idioms, respectively. First, we propose a Multi-granularity Passage Fusion (MgPF) module, which enhances the passage representation by integrating different semantic perspectives. Second, we propose a Retrospective Reading (Re²) module that implements a back-and-forth reading mechanism to concentrate on the critical Chinese idioms, thereby generating a final memory of the whole text. Notably, the intuition behind the MgPF and Re² modules is drawn from human reading strategies in the real world: the strategies these modules implement are similar to how humans perceive text. Extensive experiments are conducted on Chinese benchmark datasets to evaluate the effectiveness and superiority of the proposed method. Our RMFNet achieves state-of-the-art performance, and in-depth analysis verifies its capability to understand the deep semantics of the text.
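To make the abstract's two mechanisms concrete, the sketch below shows one plausible reading of them in PyTorch: (a) fusing fine-grained token states with a coarse pooled view of the passage (multi-granularity fusion), and (b) a retrospective second-pass read that attends from candidate idiom representations back to the fused passage memory. All module names, dimensions, and wiring here are illustrative assumptions, not the authors' released RMFNet implementation.

```python
# Illustrative sketch only: gated multi-granularity fusion plus a
# retrospective (second-pass) cross-attention read. The architecture
# details are assumptions, NOT the paper's actual RMFNet.
import torch
import torch.nn as nn


class MultiGranularityFusion(nn.Module):
    """Fuse token-level (fine) and pooled passage-level (coarse) views."""

    def __init__(self, hidden: int):
        super().__init__()
        self.gate = nn.Linear(2 * hidden, hidden)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden)
        # Coarse view: mean-pool the passage, broadcast back to every token.
        coarse = token_states.mean(dim=1, keepdim=True).expand_as(token_states)
        # Gated fusion of the two granularities.
        g = torch.sigmoid(self.gate(torch.cat([token_states, coarse], dim=-1)))
        return g * token_states + (1 - g) * coarse


class RetrospectiveRead(nn.Module):
    """Second-pass read: candidate idioms attend back over the passage
    memory, loosely mimicking re-reading the context around the blank."""

    def __init__(self, hidden: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden, num_heads, batch_first=True)

    def forward(self, idiom_states, passage_memory):
        # idiom_states: (batch, num_candidates, hidden)
        out, _ = self.attn(idiom_states, passage_memory, passage_memory)
        return out


# Toy usage: score 7 candidate idioms against a 128-token passage.
batch, seq_len, num_cands, hidden = 2, 128, 7, 64
passage = torch.randn(batch, seq_len, hidden)   # stand-in encoder output
idioms = torch.randn(batch, num_cands, hidden)  # stand-in idiom embeddings

fused = MultiGranularityFusion(hidden)(passage)
reread = RetrospectiveRead(hidden)(idioms, fused)
scores = (reread * idioms).sum(-1)              # (batch, num_candidates)
print(scores.softmax(-1).shape)                 # torch.Size([2, 7])
```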


Cited By

• (2024) A cross-guidance cross-lingual model on generated parallel corpus for classical Chinese machine reading comprehension. Information Processing and Management: An International Journal 61, 2. DOI: 10.1016/j.ipm.2023.103607. Online publication date: 12 April 2024.


Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 7
July 2023, 422 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3610376

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 20 July 2023
Online AM: 09 June 2023
Accepted: 13 May 2023
Received: 15 October 2022
Published in TALLIP Volume 22, Issue 7


Author Tags

1. Deep learning
2. Machine reading comprehension
3. Chinese idiom understanding
4. Human reading strategy
5. Information fusion

Qualifiers

• Research-article

Funding Sources

• National Natural Science Foundation of China
• National Social Science Foundation of China

