
CSS: Contrastive Span Selector for Multi-span Question Answering

  • Conference paper
  • First Online: PRICAI 2023: Trends in Artificial Intelligence (PRICAI 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14325)


Abstract

This study investigates the task of Multi-span Question Answering (MSQA). MSQA is currently modeled primarily as a sequence tagging problem, predicting whether each word is part of an answer. However, this approach predicts each word independently, without fully exploiting a comprehensive understanding of the complexities of MSQA. In this paper, we propose a novel model, the Contrastive Span Selector. Our model uses a multi-head biaffine attention mechanism to generate span representations and employs a CNN block for span-wise interaction. Additionally, we incorporate the question and a global token into the encoding process, projecting all vectors into a shared representation space. To train the model, we employ contrastive learning with a dynamic threshold that controls the similarity boundary between answer spans and non-answer spans. Our model outperforms the tagger model by 6.32 exact-match F1 on the MultiSpanQA multi-span setting and by 5.69 on the expand setting, establishing it as the state-of-the-art model for MSQA. The code is available at: https://github.com/phzh24/Contrastive-Span-Selector.
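The abstract names two core ingredients: a multi-head biaffine scorer over candidate spans and a contrastive objective that separates answer spans from non-answer spans. The paper's actual architecture (including the CNN span-interaction block and the dynamic threshold) is not reproduced here; the following is only a minimal NumPy sketch of those two ideas, where all function names, shapes, and the fixed temperature `tau` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def multi_head_biaffine_spans(H, W, n_heads):
    """Score every (start, end) token pair with a multi-head biaffine map.

    H: (L, d) token encodings; W: (n_heads, d_h, d_h) per-head biaffine
    weights with d_h = d // n_heads. Returns (L, L, n_heads), where entry
    [i, j, k] is head k's score for the span starting at i and ending at j.
    """
    L, d = H.shape
    d_h = d // n_heads
    Hh = H.reshape(L, n_heads, d_h)  # split the hidden size into heads
    # Per head k: scores[i, j, k] = H[i]_k @ W[k] @ H[j]_k
    return np.einsum('ikp,kpq,jkq->ijk', Hh, W, Hh)

def contrastive_span_loss(scores, pos_mask, tau=0.1):
    """InfoNCE-style loss pushing answer spans above non-answer spans.

    scores: (N,) flattened span scores; pos_mask: (N,) boolean marking the
    answer spans. Returns the mean negative log-probability of positives
    under a softmax over all spans (a fixed-temperature stand-in for the
    paper's dynamic-threshold objective).
    """
    logits = scores / tau
    # Numerically stable log-sum-exp for the softmax normalizer.
    m = logits.max()
    log_z = m + np.log(np.exp(logits - m).sum())
    log_p = logits - log_z
    return -log_p[pos_mask].mean()

# Toy usage: 5 tokens, hidden size 8, 4 heads, one gold answer span.
H = np.random.default_rng(0).normal(size=(5, 8))
W = np.random.default_rng(1).normal(size=(4, 2, 2))
span_scores = multi_head_biaffine_spans(H, W, n_heads=4)
pos = np.zeros(25, dtype=bool)
pos[3] = True  # pretend span (0, 3) is the answer
loss = contrastive_span_loss(span_scores.mean(axis=-1).ravel(), pos)
```

In practice the upper-triangular part of the (L, L) grid (start ≤ end) is the candidate set, and the per-head scores would be combined by a learned projection; both details are omitted here for brevity.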



Acknowledgments

The authors would like to thank the three anonymous reviewers for their comments on this paper. This work is supported by the National Key Research and Development Program of China (No. 2020YFC0833300).

Author information

Corresponding author: Wen Zhao.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Zhang, P., Xiong, G., Zhao, W. (2024). CSS: Contrastive Span Selector for Multi-span Question Answering. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science (LNAI), vol. 14325. Springer, Singapore. https://doi.org/10.1007/978-981-99-7019-3_22


  • DOI: https://doi.org/10.1007/978-981-99-7019-3_22

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7018-6

  • Online ISBN: 978-981-99-7019-3

  • eBook Packages: Computer Science (R0)
