Abstract
Citation screening is a crucial stage in conducting a Systematic Literature Review, where reviewers must read hundreds, if not thousands, of papers. Natural Language Processing-based models using Transformers have been successfully employed to automate this process and minimize the chances of missing relevant papers. In our research, we proposed three variations of these Transformer models, each with a different pre-training technique. With our models, reviewers need to read only 16 papers to train the model, saving as much as 80% of the workload. In addition, we revisited the AWSS@R metric, which normalizes the WSS@R index and provides a fair way to compare the workload saved across different datasets.
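As a rough illustration of the index that AWSS@R builds on, the sketch below computes the standard Work Saved over Sampling at recall R (WSS@R) from a ranked list of papers, following the usual definition WSS@R = (TN + FN) / N − (1 − R). The function name `wss_at_r` and the toy data are hypothetical, and the normalization that turns WSS@R into AWSS@R is defined in the chapter itself and is not reproduced here.

```python
import math


def wss_at_r(labels, scores, recall=0.95):
    """Work Saved over Sampling at a target recall (hypothetical helper).

    labels: 1 for a relevant paper, 0 for an irrelevant one.
    scores: model relevance scores; higher means "screen this first".
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_total = len(labels)
    n_relevant = sum(labels)
    if n_total == 0 or n_relevant == 0:
        raise ValueError("need at least one paper and one relevant paper")

    # Number of relevant papers that must be found to reach the target recall.
    needed = math.ceil(recall * n_relevant)

    # Screen papers from the highest score down until the target is reached.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = 0
    screened = 0
    for _, label in ranked:
        screened += 1
        found += label
        if found >= needed:
            break

    # Papers left unread (TN + FN) as a fraction of all papers,
    # minus the fraction that random sampling would already save.
    return (n_total - screened) / n_total - (1.0 - recall)


# Toy example: 10 papers, 2 of them relevant, ranked by a hypothetical model.
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_wss = wss_at_r(toy_labels, toy_scores, recall=1.0)
```

Even a perfect ranking cannot reach a WSS@R of 1, because the relevant papers themselves must still be read. The best achievable value therefore depends on the share of relevant papers in a dataset, which is one reason raw WSS@R values are hard to compare across datasets.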
Acknowledgement
We sincerely thank the Brazilian Ministry of Science, Technology, and Innovation, which partially supported this project.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Faria, A.V.A., de Melo, M.K., de Oliveira, F.A.R., Weigang, L., Celestino, V.R.R. (2023). Automated SLR with a Few Labeled Papers and a Fair Workload Metric. In: Marchiori, M., Domínguez Mayo, F.J., Filipe, J. (eds) Web Information Systems and Technologies. WEBIST 2022. Lecture Notes in Business Information Processing, vol 494. Springer, Cham. https://doi.org/10.1007/978-3-031-43088-6_1
DOI: https://doi.org/10.1007/978-3-031-43088-6_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43087-9
Online ISBN: 978-3-031-43088-6
eBook Packages: Computer Science, Computer Science (R0)