An Efficient Approach for Improving the Recall of Rough Abstract Retrieval in Scientific Claim Verification

Zhang, Zhiwei; Li, Jiyi; Fukumoto, Fumiyo

doi:10.1007/978-3-031-44198-1_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14261)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Scientific claim verification helps researchers find, for a given claim, the relevant scientific papers and their sentence-level evidence in a large corpus. Because the corpus contains a huge number of papers, most existing scientific claim verification solutions work in two stages: they first roughly detect a set of candidate related papers using naïve but fast methods such as similarity measures, and then apply large but relatively slow deep neural models for accurate classification. To improve the recall of the overall system by improving the recall of the rough abstract retrieval stage, we propose an approach that also uses a neural classification model in the rough retrieval stage. To keep the approach scalable, we propose a distillation-based method that yields a lightweight model for this stage. Experimental results on the benchmark dataset SciFact show that our approach outperforms existing work.
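
The abstract does not state which distillation objective is used. As a rough illustration only, the sketch below assumes the common soft-target formulation of knowledge distillation, in which a lightweight student is trained to match the temperature-softened output distribution of a larger teacher while also fitting the gold label. The function names, the two-class (relevant / not relevant) logits, and the `temperature` and `alpha` values are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, stabilised by subtracting the maximum."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term.

    Assumption: this is the generic soft-target recipe, not necessarily
    the objective used by the authors.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Cross-entropy of the student's unsoftened prediction against the gold label.
    ce = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft-target term on a comparable
    # scale when the temperature changes.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Hypothetical logits for one (claim, abstract) pair: [not relevant, relevant].
teacher = [0.2, 3.1]
student = [0.5, 1.4]
print(distillation_loss(student, teacher, label=1))
```

In a setup like this, the student would replace the similarity measure in the rough retrieval stage, so its size and inference speed, rather than the teacher's, determine how the first stage scales with the corpus.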

Notes

1. https://github.com/allenai/scifact

Acknowledgements

This work was partially supported by 23H03402.

Author information

Authors and Affiliations

Binjiang Institute of Zhejiang University, Hangzhou, China
Zhiwei Zhang
University of Yamanashi, Kofu, Japan
Jiyi Li & Fumiyo Fukumoto

Authors

Zhiwei Zhang
Jiyi Li
Fumiyo Fukumoto

Corresponding author

Correspondence to Jiyi Li.

Editor information

Editors and Affiliations

Democritus University of Thrace, Xanthi, Greece
Lazaros Iliadis
Democritus University of Thrace, Xanthi, Greece
Antonios Papaleonidas
Lancaster University, Lancaster, UK
Plamen Angelov
Teesside University, Middlesbrough, UK
Chrisina Jayne

About this paper

Cite this paper

Zhang, Z., Li, J., Fukumoto, F. (2023). An Efficient Approach for Improving the Recall of Rough Abstract Retrieval in Scientific Claim Verification. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14261. Springer, Cham. https://doi.org/10.1007/978-3-031-44198-1_6

DOI: https://doi.org/10.1007/978-3-031-44198-1_6
Published: 22 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44197-4
Online ISBN: 978-3-031-44198-1
eBook Packages: Computer Science, Computer Science (R0)
