CoLISA: Inner Interaction via Contrastive Learning for Multi-choice Reading Comprehension

  • Conference paper
Advances in Information Retrieval (ECIR 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13980)

Abstract

Multi-choice reading comprehension (MC-RC) requires selecting the most appropriate answer from multiple candidate options by reading and comprehending a given passage and a question. Recent studies concentrate on capturing the relationships within the passage-question-option triplet. Nevertheless, current approaches share one limitation: confusing distractors are often mistakenly judged as correct, because models do not emphasize the differences between the answer alternatives. Motivated by the way humans handle multi-choice questions, namely by comparing the given options, we propose CoLISA (Contrastive Learning and In-Sample Attention), a novel model that prudently excludes confusing distractors. In particular, CoLISA acquires option-aware representations via contrastive learning over multiple options. Besides, in-sample attention mechanisms are applied across multiple options so that they can interact with each other. Experimental results on QuALITY and RACE demonstrate that CoLISA pays more attention to the relation between correct and distractive options and recognizes the discrepancy between them. Meanwhile, CoLISA also reaches state-of-the-art performance on QuALITY (our code is available at https://github.com/Walle1493/CoLISA).
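
The abstract's two ingredients, contrastive learning over the options of one question and in-sample attention that lets those options interact, can be pictured with a minimal PyTorch sketch. The module below is only an illustration of the general idea, not the authors' implementation (see the linked repository for that): the mean-of-options anchor, the InfoNCE-style temperature, the single attention layer, and all names are assumptions introduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class OptionInteractionHead(nn.Module):
    """Illustrative head combining in-sample attention across a question's
    options with an InfoNCE-style contrastive term (an assumed formulation,
    not necessarily the paper's exact objective)."""

    def __init__(self, hidden_size: int = 768, num_heads: int = 8, temperature: float = 0.1):
        super().__init__()
        # Options of the same sample attend to one another ("in-sample" attention).
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.scorer = nn.Linear(hidden_size, 1)
        self.temperature = temperature

    def forward(self, option_reps: torch.Tensor, answer_idx: torch.Tensor):
        # option_reps: (batch, num_options, hidden) pooled encodings of
        # [passage; question; option] produced by a pre-trained encoder.
        interacted, _ = self.attn(option_reps, option_reps, option_reps)

        # Standard multi-choice scoring: one logit per option.
        logits = self.scorer(interacted).squeeze(-1)            # (batch, num_options)
        ce_loss = F.cross_entropy(logits, answer_idx)

        # Contrastive term (assumed form): the correct option is the positive,
        # the distractors of the same question are negatives, both measured
        # against a question-level anchor (here, the mean of all options).
        anchor = interacted.mean(dim=1, keepdim=True)           # (batch, 1, hidden)
        sims = F.cosine_similarity(anchor, interacted, dim=-1) / self.temperature
        cl_loss = F.cross_entropy(sims, answer_idx)

        return ce_loss + cl_loss, logits


# Toy usage: 2 questions, 4 options each, 768-dim pooled representations.
reps = torch.randn(2, 4, 768)
gold = torch.tensor([1, 3])
loss, logits = OptionInteractionHead()(reps, gold)
```

Scoring and contrasting after the attention step, rather than encoding each option in isolation, is what lets the distractors and the correct answer be compared directly.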


Notes

  1. The issue of input passages exceeding the length constraint also exists in QuALITY; we carefully address it in our work as well.

  2. To keep the extracted context at an appropriate length, we simply set k to 2 and n to 1 in our experiments.

  3. The DPR-based retriever can handle long inputs, which are characteristic of QuALITY. Moreover, the best distractors are annotated only in QuALITY; we therefore run these experiments mainly on QuALITY (see the retrieval sketch after these notes).

  4. The source data in QuALITY is divided into full and hard subsets according to question difficulty, while in RACE the middle and high subsets represent two levels of school entrance exams.

  5. The performance of base models is actually far lower than that listed: transferring the identical experiments from large models to base models yields worse results, shown in the last column of Table 3, where performance drops sharply from 40.8 to 37.1 (our baseline here is RoBERTa-base). Due to device limitations, we have to use a small batch size on large models; hence, the in-batch and in-sample settings do not show such an enormous gap on large models.

  6. The parameter n is 12 for base models and 24 for large models.
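
As referenced in note 3, a dense retriever can be used to shrink QuALITY's over-long passages so they fit the encoder's length constraint (notes 1 and 2). The sketch below is a rough, assumed illustration of that step with an off-the-shelf DPR bi-encoder from Hugging Face Transformers; the checkpoints, segment length, and number of kept segments are placeholders, not the paper's configuration.

```python
import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

# Off-the-shelf checkpoints used purely for illustration.
Q_NAME = "facebook/dpr-question_encoder-single-nq-base"
C_NAME = "facebook/dpr-ctx_encoder-single-nq-base"
q_tok = DPRQuestionEncoderTokenizer.from_pretrained(Q_NAME)
q_enc = DPRQuestionEncoder.from_pretrained(Q_NAME)
c_tok = DPRContextEncoderTokenizer.from_pretrained(C_NAME)
c_enc = DPRContextEncoder.from_pretrained(C_NAME)


def shrink_passage(question: str, passage: str, seg_len: int = 100, keep: int = 4) -> str:
    """Split an over-long passage into word segments, rank them by dense
    similarity to the question, and keep the top segments in document order."""
    words = passage.split()
    segments = [" ".join(words[i:i + seg_len]) for i in range(0, len(words), seg_len)]
    with torch.no_grad():
        q_emb = q_enc(**q_tok(question, return_tensors="pt")).pooler_output      # (1, 768)
        c_emb = c_enc(**c_tok(segments, return_tensors="pt",
                              padding=True, truncation=True)).pooler_output      # (n_seg, 768)
    scores = (c_emb @ q_emb.T).squeeze(-1)                                        # dot-product relevance
    top = scores.topk(min(keep, len(segments))).indices.tolist()
    return " ".join(segments[i] for i in sorted(top))
```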


Acknowledgements

The research is supported by the National Key R&D Program of China (2020YFB1313601) and the National Science Foundation of China (62076174, 62076175).

Author information


Corresponding author

Correspondence to Yu Hong.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Dong, M., Zou, B., Li, Y., Hong, Y. (2023). CoLISA: Inner Interaction via Contrastive Learning for Multi-choice Reading Comprehension. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13980. Springer, Cham. https://doi.org/10.1007/978-3-031-28244-7_17

  • DOI: https://doi.org/10.1007/978-3-031-28244-7_17

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-28243-0

  • Online ISBN: 978-3-031-28244-7

  • eBook Packages: Computer Science, Computer Science (R0)
