Abstract
This study investigates the task of Multi-span Question Answering (MSQA). Currently, MSQA is primarily modeled as a sequence tagging problem, predicting whether each word is part of an answer. However, this approach predicts words independently and does not fully exploit a comprehensive understanding of the complexities of MSQA. In this paper, we propose a novel model, the Contrastive Span Selector (CSS). Our model uses a multi-head biaffine attention mechanism to generate span representations and employs a CNN block for span-wise interaction. Additionally, we incorporate the question and a global token into the encoding process, projecting all vectors into a shared representation space. To train the model, we employ contrastive learning with a dynamic threshold that controls the similarity boundary between answer spans and non-answer spans. Our model outperforms the tagger model by 6.32 exact-match F1 on the MultiSpanQA multi-span setting and by 5.69 on the expand setting, establishing it as the state-of-the-art model for MSQA. The code is available at: https://github.com/phzh24/Contrastive-Span-Selector.
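The abstract's two core ideas (biaffine span representations, then selecting spans whose similarity to the question exceeds a dynamic threshold) can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: all shapes, variable names, and the cosine-similarity scoring are assumptions; the bilinear-plus-linear scoring form follows standard biaffine attention (Dozat and Manning, 2017), and the CNN span-interaction block and the contrastive training loss are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
L, d, r = 6, 8, 4          # tokens, hidden size, span-representation size (illustrative)

def biaffine_span_reps(h_s, h_e, U, W, b):
    """r-dimensional biaffine representation for every candidate span (i, j).

    h_s, h_e : (L, d) start/end token features (e.g. two projection heads over an encoder).
    U        : (d, r, d) bilinear tensor; W : (2*d, r) linear term; b : (r,) bias.
    Returns  : (L, L, r), where entry [i, j] represents the span from token i to token j.
    """
    # Bilinear term: h_s[i]^T U h_e[j] for every (i, j) pair.
    bilinear = np.einsum('ia,arb,jb->ijr', h_s, U, h_e)
    # Linear term over the concatenation [h_s[i]; h_e[j]], split across W.
    linear = (h_s @ W[:d])[:, None, :] + (h_e @ W[d:])[None, :, :]
    return bilinear + linear + b

h_s = rng.standard_normal((L, d))
h_e = rng.standard_normal((L, d))
U = rng.standard_normal((d, r, d)) * 0.1
W = rng.standard_normal((2 * d, r)) * 0.1
b = np.zeros(r)

spans = biaffine_span_reps(h_s, h_e, U, W, b)   # (L, L, r)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# Inference-time selection (sketch): score each span against the question
# vector in the shared space, and keep spans whose similarity exceeds a
# boundary derived from the global token -- a "dynamic threshold" in the
# sense that it depends on the input rather than being a fixed constant.
q = rng.standard_normal(r)          # question representation (assumed projected to r dims)
g = rng.standard_normal(r)          # global-token representation
threshold = cosine(q, g)
picked = [(i, j) for i in range(L) for j in range(i, L)
          if cosine(spans[i, j], q) > threshold]
```

Treating every (i, j) pair as a candidate gives O(L^2) spans, which is why span representations are computed in one batched einsum rather than in a Python loop.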
Acknowledgments
The authors would like to thank the three anonymous reviewers for their comments on this paper. This work is supported by the National Key Research and Development Program of China (No. 2020YFC0833300).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Zhang, P., Xiong, G., Zhao, W. (2024). CSS: Contrastive Span Selector for Multi-span Question Answering. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science(), vol 14325. Springer, Singapore. https://doi.org/10.1007/978-981-99-7019-3_22
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7018-6
Online ISBN: 978-981-99-7019-3