
How Confident Was Your Reviewer? Estimating Reviewer Confidence from Peer Review Texts

  • Conference paper
  • In: Document Analysis Systems (DAS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13237)

Abstract

The scholarly peer-review system is the primary means of ensuring the quality of scientific publications. An area or program chair relies on reviewers' confidence scores to resolve conflicting reviews and borderline cases. Usually, reviewers themselves disclose how confident they are in reviewing a certain paper. However, there can be inconsistencies between the confidence reviewers self-report and how the review text appears to readers, and it is the job of the area or program chair to weigh such inconsistencies and make a reasonable judgment. Peer review texts are a valuable resource for Natural Language Processing (NLP) studies, and the community is uniquely poised to investigate such inconsistencies in the paper-vetting system. In this work, we attempt to estimate automatically, directly from the review text, how confident the reviewer was. We experiment with five data-driven methods for predicting the reviewer's confidence score: Linear Regression, Decision Tree, Support Vector Regression, Bidirectional Encoder Representations from Transformers (BERT), and a hybrid of Bidirectional Long Short-Term Memory (BiLSTM) and Convolutional Neural Networks (CNN) on top of BERT representations. Our experiments show that the deep neural model grounded in BERT representations yields encouraging performance.
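As a rough illustration of the hybrid architecture the abstract describes, the PyTorch sketch below feeds BERT token representations through a BiLSTM, a 1-D convolution with max-pooling over time, and a linear head that regresses a scalar confidence score. The base checkpoint, layer sizes, kernel width, and pooling choice are illustrative assumptions, not the authors' reported configuration (their released code is linked in the notes below).

```python
# A minimal sketch of a BiLSTM+CNN regressor over BERT representations.
# All hyperparameters here are assumptions for illustration only.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer


class ConfidenceRegressor(nn.Module):
    def __init__(self, lstm_hidden=128, conv_channels=64, kernel_size=3):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.bilstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,  # 768 for bert-base
            hidden_size=lstm_hidden,
            batch_first=True,
            bidirectional=True,
        )
        self.conv = nn.Conv1d(2 * lstm_hidden, conv_channels, kernel_size)
        self.head = nn.Linear(conv_channels, 1)  # scalar confidence score

    def forward(self, input_ids, attention_mask):
        # Contextual token embeddings: (batch, seq_len, hidden_size)
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.bilstm(hidden)     # (batch, seq_len, 2*lstm_hidden)
        conv_in = lstm_out.transpose(1, 2)    # Conv1d expects (batch, C, seq)
        features = torch.relu(self.conv(conv_in))
        pooled = features.max(dim=2).values   # max-pool over the time axis
        return self.head(pooled).squeeze(-1)  # (batch,) predicted scores


tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = ConfidenceRegressor()
batch = tokenizer(["The paper is interesting, but the evaluation is thin."],
                  return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    print(model(batch["input_ids"], batch["attention_mask"]))
```

Such a model would typically be trained end to end with a mean-squared-error loss against the reviewers' self-reported confidence scores.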


Change history

  • 18 June 2022

    In the version of this paper that was originally published, one crucial acknowledgement was missing. This has now been corrected.

Notes

  1. In this manuscript, editors/chairs are used interchangeably.

  2. https://openreview.net/.

  3. https://github.com/PrabhatkrBharti/Rev.Conf.git.

  4. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.levene.html (a brief usage sketch follows these notes).
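For reference, here is a minimal usage sketch of the Levene test for equality of variances that note 4 links to; the two groups below are toy data, not the paper's.

```python
# Toy example of scipy.stats.levene, which tests whether groups
# have equal variances. The scores below are made up for illustration.
from scipy.stats import levene

group_a = [3, 4, 4, 5, 3, 4]   # e.g. one set of confidence scores
group_b = [2, 5, 1, 5, 2, 4]   # e.g. another set to compare against
stat, p_value = levene(group_a, group_b)
print(f"Levene W = {stat:.3f}, p = {p_value:.3f}")
# A small p-value suggests the groups' variances differ significantly.
```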


Acknowledgments

The first author, Prabhat Kumar Bharti, acknowledges the Quality Improvement Programme, an initiative of the All India Council for Technical Education (AICTE), Government of India, for fellowship support. Tirthankar Ghosal is funded by Cactus Communications, India (Award # CAC-2021-01) to carry out this research. The fourth author, Asif Ekbal, receives the Visvesvaraya Young Faculty Award. Thanks to the Digital India Corporation, Ministry of Electronics and Information Technology, Government of India, for funding this research.

Author information


Corresponding author

Correspondence to Prabhat Kumar Bharti.


Ethics declarations

Ethics Compliance

In no way does this work attempt to replace human reviewers. Rather, a good use case of this research is to enhance the quality of the review process by leveraging human-AI collaboration to predict reviewers' confidence scores, which would eventually help editors assess review quality.


Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Bharti, P.K., Ghosal, T., Agrawal, M., Ekbal, A. (2022). How Confident Was Your Reviewer? Estimating Reviewer Confidence from Peer Review Texts. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science, vol 13237. Springer, Cham. https://doi.org/10.1007/978-3-031-06555-2_9


  • DOI: https://doi.org/10.1007/978-3-031-06555-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06554-5

  • Online ISBN: 978-3-031-06555-2

  • eBook Packages: Computer Science, Computer Science (R0)
