
Automated SLR with a Few Labeled Papers and a Fair Workload Metric

  • Conference paper
Web Information Systems and Technologies (WEBIST 2022)

Abstract

Citation screening is a crucial stage of a Systematic Literature Review (SLR), in which reviewers must read hundreds, if not thousands, of papers. Transformer-based Natural Language Processing models have been successfully employed to automate this process and minimize the chance of missing relevant papers. In our research, we proposed three variations of these Transformer models, each with a different pre-training technique. With our models, reviewers need to read only 16 papers to train the model, saving as much as 80% of the workload. In addition, we revisited the AWSS@R metric, which normalizes the WSS@R index and provides a fair way to estimate the workload saved across different datasets.
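For readers unfamiliar with the WSS@R index mentioned above, the following is a minimal sketch of the commonly used Work Saved over Sampling definition, WSS@R = (TN + FN)/N - (1 - R), computed from a ranked list of papers. The function name `wss_at_recall` and its parameters are illustrative assumptions and are not taken from the paper or its repository. The normalization that yields AWSS@R is not stated in the abstract, so it is deliberately not reproduced here.

```python
import math


def wss_at_recall(labels, scores, recall=0.95):
    """Work Saved over Sampling at a target recall (hypothetical helper).

    labels: 1 for a relevant paper, 0 for an irrelevant one.
    scores: model scores, higher meaning "more likely relevant".
    """
    n = len(labels)
    n_pos = sum(labels)
    if n == 0 or n_pos == 0:
        raise ValueError("need at least one paper and one relevant paper")

    # Rank papers from most to least likely relevant.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    # Number of relevant papers that must be found to reach the target recall
    # (the small epsilon guards against floating-point overshoot in the product).
    needed = math.ceil(recall * n_pos - 1e-9)

    found = 0
    read = 0
    for read, (_, label) in enumerate(ranked, start=1):
        found += label
        if found >= needed:
            break

    # Papers below the cutoff are never read by the reviewer: TN + FN.
    not_read = n - read
    return not_read / n - (1.0 - recall)


if __name__ == "__main__":
    labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    print(wss_at_recall(labels, scores, recall=0.95))
```

Because the best attainable WSS@R depends on the share of relevant papers in a collection, raw values are hard to compare between datasets, which is the motivation for a normalized variant such as AWSS@R.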


Notes

  1. https://github.com/BecomeAllan/ML-SLRC/tree/main/book

  2. https://neuralmagic.com/deepsparse/


Acknowledgement

We sincerely thank the Brazilian Ministry of Science, Technology, and Innovation, which partially supported this project.

Author information


Correspondence to Victor Rafael Rezende Celestino.


Appendix


Table 2. Summary of the mean and standard deviation of five validations, considering 16 examples (eight positive and eight negative) in the domain learner phase, after training the respective ML-SLRC in the meta learner phase (50–50 split).
Table 3. Summary of the mean and standard deviation of five validations, considering 16 examples (eight positive and eight negative) in the domain learner phase, after training the respective ML-SLRC in the meta learner phase (benchmarking).


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Faria, A.V.A., de Melo, M.K., de Oliveira, F.A.R., Weigang, L., Celestino, V.R.R. (2023). Automated SLR with a Few Labeled Papers and a Fair Workload Metric. In: Marchiori, M., Domínguez Mayo, F.J., Filipe, J. (eds) Web Information Systems and Technologies. WEBIST 2022. Lecture Notes in Business Information Processing, vol 494. Springer, Cham. https://doi.org/10.1007/978-3-031-43088-6_1


  • DOI: https://doi.org/10.1007/978-3-031-43088-6_1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43087-9

  • Online ISBN: 978-3-031-43088-6

  • eBook Packages: Computer Science, Computer Science (R0)
