An Efficient Approach for Improving the Recall of Rough Abstract Retrieval in Scientific Claim Verification

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2023 (ICANN 2023)

Abstract

Scientific claim verification helps researchers find, in a large corpus, the scientific papers and the evidence sentences relevant to a given claim. Because the corpus contains a huge number of papers, most existing scientific claim verification solutions work in two stages: they first roughly detect a set of candidate related papers with simple but fast methods such as similarity measures, and then apply large but relatively slow deep neural models for accurate classification. To improve the recall of the overall system by improving the recall of the rough abstract retrieval stage, we propose an approach that also uses a neural classification model in the rough retrieval stage. To keep the approach scalable, we propose a distillation-based method that yields a lightweight model for the rough retrieval stage. Experimental results on the benchmark dataset SciFact show that our approach outperforms existing work.
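
The abstract does not spell out the distillation objective, so the following is only a minimal sketch of the standard soft-target formulation of knowledge distillation, not the authors' implementation. The function names, the temperature value, and the two-class logits (relevant vs. not relevant for a claim-abstract pair) are assumptions made for illustration. The lightweight student is trained to match the temperature-softened output distribution of the large teacher classifier.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper: a generic soft-target loss, assumed here because
    the abstract does not state the exact objective.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )
    return kl * temperature ** 2


# Illustrative logits for one claim-abstract pair: [relevant, not relevant].
teacher = [2.0, -1.0]
matching_student = [2.0, -1.0]   # student that reproduces the teacher
weak_student = [0.0, 0.0]        # student that is undecided

loss_match = distillation_loss(matching_student, teacher)
loss_weak = distillation_loss(weak_student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the signal a lightweight retrieval model would be trained on under this assumption.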

Notes

  1. https://github.com/allenai/scifact

Acknowledgements

This work was partially supported by 23H03402.

Author information

Corresponding author

Correspondence to Jiyi Li.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, Z., Li, J., Fukumoto, F. (2023). An Efficient Approach for Improving the Recall of Rough Abstract Retrieval in Scientific Claim Verification. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14261. Springer, Cham. https://doi.org/10.1007/978-3-031-44198-1_6

  • DOI: https://doi.org/10.1007/978-3-031-44198-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44197-4

  • Online ISBN: 978-3-031-44198-1

  • eBook Packages: Computer Science, Computer Science (R0)
