
How Confident Was Your Reviewer? Estimating Reviewer Confidence from Peer Review Texts

  • Conference paper
  • In: Document Analysis Systems (DAS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13237)

Abstract

The scholarly peer-review system is the primary means of ensuring the quality of scientific publications. An area or program chair relies on reviewers' confidence scores to resolve conflicting reviews and borderline cases. Usually, reviewers themselves disclose how confident they are in reviewing a certain paper. However, there can be inconsistencies between the confidence reviewers self-report and how the review text appears to readers, and it is the job of the area or program chair to weigh such inconsistencies and make a reasonable judgment. Peer review texts are a valuable resource for Natural Language Processing (NLP) studies, and the community is uniquely poised to investigate such inconsistencies in the paper-vetting system. In this work, we attempt to estimate automatically, directly from the review text, how confident the reviewer was. We experiment with five data-driven methods for predicting the reviewer's confidence score: Linear Regression, Decision Tree, Support Vector Regression, Bidirectional Encoder Representations from Transformers (BERT), and a hybrid of Bidirectional Long Short-Term Memory (BiLSTM) and Convolutional Neural Networks (CNN) on top of BERT representations. Our experiments show that the deep neural model grounded in BERT representations yields encouraging performance.
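As a rough illustration of the hybrid architecture the abstract describes, the PyTorch sketch below feeds BERT token representations through a BiLSTM, a 1-D convolution with max-pooling over time, and a linear head that regresses a scalar confidence score. The base checkpoint, layer sizes, kernel width, and pooling choice are illustrative assumptions, not the authors' reported configuration (their released code is linked in the notes below).

```python
# A minimal sketch of a BiLSTM+CNN regressor over BERT representations.
# All hyperparameters here are assumptions for illustration only.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer


class ConfidenceRegressor(nn.Module):
    def __init__(self, lstm_hidden=128, conv_channels=64, kernel_size=3):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.bilstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,  # 768 for bert-base
            hidden_size=lstm_hidden,
            batch_first=True,
            bidirectional=True,
        )
        self.conv = nn.Conv1d(2 * lstm_hidden, conv_channels, kernel_size)
        self.head = nn.Linear(conv_channels, 1)  # scalar confidence score

    def forward(self, input_ids, attention_mask):
        # Contextual token embeddings: (batch, seq_len, hidden_size)
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.bilstm(hidden)     # (batch, seq_len, 2*lstm_hidden)
        conv_in = lstm_out.transpose(1, 2)    # Conv1d expects (batch, C, seq)
        features = torch.relu(self.conv(conv_in))
        pooled = features.max(dim=2).values   # max-pool over the time axis
        return self.head(pooled).squeeze(-1)  # (batch,) predicted scores


tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = ConfidenceRegressor()
batch = tokenizer(["The paper is interesting, but the evaluation is thin."],
                  return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    print(model(batch["input_ids"], batch["attention_mask"]))
```

Such a model would typically be trained end to end with a mean-squared-error loss against the reviewers' self-reported confidence scores.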


Change history

  • 18 June 2022

    In the version of this paper that was originally published, one crucial acknowledgement was missing. This has now been corrected.

Notes

  1. In this manuscript, editors/chairs are used interchangeably.

  2. https://openreview.net/.

  3. https://github.com/PrabhatkrBharti/Rev.Conf.git.

  4. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.levene.html (a brief usage sketch follows these notes).
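For reference, here is a minimal usage sketch of the Levene test for equality of variances that note 4 links to; the two groups below are toy data, not the paper's.

```python
# Toy example of scipy.stats.levene, which tests whether groups
# have equal variances. The scores below are made up for illustration.
from scipy.stats import levene

group_a = [3, 4, 4, 5, 3, 4]   # e.g. one set of confidence scores
group_b = [2, 5, 1, 5, 2, 4]   # e.g. another set to compare against
stat, p_value = levene(group_a, group_b)
print(f"Levene W = {stat:.3f}, p = {p_value:.3f}")
# A small p-value suggests the groups' variances differ significantly.
```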


Acknowledgments

The first author, Prabhat Kumar Bharti, acknowledges the Quality Improvement Programme, an initiative of the All India Council for Technical Education (AICTE), Government of India, for fellowship support. Tirthankar Ghosal is funded by Cactus Communications, India (Award # CAC-2021-01) to carry out this research. The fourth author, Asif Ekbal, receives the Visvesvaraya Young Faculty Award. Thanks to the Digital India Corporation, Ministry of Electronics and Information Technology, Government of India, for funding this research.

Author information


Corresponding author

Correspondence to Prabhat Kumar Bharti.


Ethics declarations

Ethics Compliance

In no way does this work attempt to replace human reviewers. Rather, a good use case of this research is to enhance the quality of the review process by leveraging human-AI collaboration to predict reviewers' confidence scores, which would eventually help editors assess review quality.


Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Bharti, P.K., Ghosal, T., Agrawal, M., Ekbal, A. (2022). How Confident Was Your Reviewer? Estimating Reviewer Confidence from Peer Review Texts. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science, vol 13237. Springer, Cham. https://doi.org/10.1007/978-3-031-06555-2_9


  • DOI: https://doi.org/10.1007/978-3-031-06555-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06554-5

  • Online ISBN: 978-3-031-06555-2

  • eBook Packages: Computer Science, Computer Science (R0)
