Elsevier

Neurocomputing

Volume 428, 7 March 2021, Pages 218-238

A deep neural architecture based meta-review generation and final decision prediction of a scholarly article

https://doi.org/10.1016/j.neucom.2020.11.004

Highlights

  • We propose MetaGen, which addresses two tasks: peer review prediction and meta-review generation.

  • Integrated framework of convolution layer, Bi-LSTM model, and attention mechanism to predict the final decision.

  • Provide a concise meta-review which maximizes information coverage, coherence, readability and also reduces redundancy.

  • Experiments on the PeerRead dataset show that MetaGen outperforms state-of-the-art methods.

Abstract

Peer reviews form an essential part of scientific communications. Research papers and proposals are reviewed by several peers before they are finally accepted or rejected for publication and funding, respectively. With the steady increase in the number of research domains, scholarly venues (journal and/or conference), researchers, and papers, managing the peer review process is becoming a daunting task. Application of recommender systems to assist peer reviewing is, therefore, being explored and becoming an emerging research area. In this paper, we present a deep learning network based Meta-Review Generation considering peer review prediction of the scholarly article (MRGen). MRGen is able to provide solutions for: (i) Peer review prediction (Task 1) and (ii) Meta-review generation (Task 2). First, the system takes the peer reviews as input and produces a draft meta-review. Then it employs an integrated framework of convolution layer, long short-term memory (LSTM) model, Bi-LSTM model, and attention mechanism to predict the final decision (accept/reject) of the scholarly article. Based on the final decision, the proposed model MRGen incorporates Pointer Generator Network-based abstractive summarization to generate the final meta-review. The focus of our approach is to give a concise meta-review that maximizes information coverage, coherence, readability and also reduces redundancy. Extensive experiments conducted on the PeerRead dataset demonstrate good consistency between the recommended decisions and original decisions. We also compare the performance of MRGen with some of the existing state-of-the-art multi-document summarization methods. The system also outperforms a few existing models based on accuracy, Rouge scores, readability, non-redundancy, and cohesion.

Introduction

In the modern era, the enormous growth of large academic data has made it difficult to extract insightful and efficient information [1]. There has been rapid development in the scale of academic entities such as researchers, publications, and venues, and of academic relations such as co-authorship and inter-citation [2]. For example, the DBLP dataset, a collection of scientific publication records and the relationships among them, covers more than 9585 computer science conferences and more than 4152 journals [3]. This phenomenon has not only provided opportunities for researchers but has also given rise to new challenges, especially in managing the peer-review process.

The exponential growth of the volume and variety of data demands more sophisticated resources to assist with the exploration of academic data, and a significant effort has been made to encourage different types of meaningful applications [4]. Representative works include reference recommendation, scholars’ profiling, and academic impact estimation. Peer review mining and recommendation has been regarded as a useful application in exploiting academic reviews, which aims to provide appropriate review specific scores for individual reviews. Recommender systems on a peer review platform are, therefore, an emerging research area [5].

With the increasing amount of scientific research work being done, there is a need to speed up the process of evaluation so as to handle a large number of papers and encourage a large number of submissions. One of the best and most widely used ways to assess research papers is scholarly peer review (refereeing) [6]. Peer review is an essential part of scientific communications. Research papers and proposals are reviewed by several peers before they are finally accepted or rejected for publication and funding, respectively. The procedure requires experts to give their reviews of a research work, and then another expert writes a meta-review considering the peer reviews and comments by the other experts, as depicted in Fig. 1. In the meta-review, the expert mainly focuses on the important points that influence the decision regarding the paper.

It is still unclear, however, whether or not the review texts and overall recommendations are consistent with each other. If not, the review submission method should alert the editor or program chair if there is a possible error in the overall recommendation. In addition, if we can automatically highlight the controversies of the reviewer on the submitted paper (e.g., the main pros and cons in the review text), it would not only assist the chair to write a thorough meta-review, but will also be useful for authors to further enhance their article.

Most of the current studies have attempted to automatically predict the helpfulness of peer reviews or determine the consistency of peer reviews based on hand-crafted features (e.g., bag-of-words) and machine learning models [7], [8], [9]. These methods rely on a variety of hand-crafted features, whose construction is time-consuming and repetitive. In addition, they cannot generalise well to other domains or areas. Deep learning has emerged in recent years as a powerful way of solving problems of sentiment classification in academia [10], [11]. However, the word sequence information is not fully utilized by the semantic representation of existing methods.

Text summarization is an essential tool in today’s world, where data is getting generated at a tremendous rate [12]. Text summarization can be of two types: extractive and abstractive. While extractive text summarization focuses on selecting important sentences from the original document(s), abstractive summarization focuses on generating new sentences from the important points of the original document(s). Generally, extractive summaries have higher sentence quality as the sentences were written by actual individuals. The task of meta-review generation from given peer reviews also has a similar objective, i.e., to select the crucial sentences/points from the peer review and present the relevant information in a concise manner.

The current text summarization methods, however, cannot be directly applied to the task of meta-review generation. This is because a meta-review focuses on specific aspects of a paper which can affect the decision regarding the acceptance of the paper. For example, plagiarism in an article can lead to outright rejection, while spelling mistakes mostly have no effect on the decision. Also, the existing summarization methods do not distinguish between the positive and negative points in a summary; thus, they cannot identify which points are strong or weak. Meta-reviews, on the other hand, require strong points which affect the decision regarding the acceptance of the paper.

To address the above problems, we investigate two tasks (i) Predicting the final decision status (accept, reject), (ii) Meta-review generation from peer reviews of a scholarly article. Therefore, we propose a deep neural network-based Meta-Review Generation considering peer review prediction of the academic article (MRGen). To predict the final decision (accept/reject) of the scholarly article, we propose a unified architecture integrating convolution layer, Long short-term memory (LSTM) model, Bi-LSTM model, and attention mechanism. In particular, our convolution layer on adjacent words is able to capture significant local features of the text (reviews). LSTMs can learn the long-term temporal dependencies and the positional relation of features as well as the global features of the whole reviews. Besides, the model can be improved by considering the difference in contributions of various words by employing attention mechanism for the decision prediction task.
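The idea of weighting words by their contribution can be illustrated with a minimal attention-pooling sketch. It is not the authors' implementation: the scoring function (a dot product with a context vector), the name `attention_pool`, and the toy hidden states are all assumptions introduced here for illustration. In a trained model the hidden states would come from the LSTM/Bi-LSTM layers and the context vector would be learned.

```python
import math


def attention_pool(hidden_states, query):
    """Attention-weighted sum of per-word hidden states.

    hidden_states: list of equal-length vectors, one per word.
    query: hypothetical context vector of the same length (learned in practice).
    """
    # Score each word by its dot product with the context vector.
    scores = [sum(h_k * q_k for h_k, q_k in zip(h, query)) for h in hidden_states]
    # Softmax over the scores (shifted by the max for numerical stability).
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum of hidden states is the review representation.
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled


# Toy hidden states for a three-word review (made-up values).
states = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
weights, pooled = attention_pool(states, query=[1.0, 1.0])
```

Here the second word receives the largest weight because its hidden state aligns best with the context vector, so it dominates the pooled representation that a downstream classifier would see.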

The proposed system MRGen makes use of different graph and NLP based techniques to generate a meta-review. It focuses on reducing redundancy and increasing coherence and readability. It also makes use of sentiment analysis to identify whether the sentences/comments in the review are positive or negative and then structures the meta-review accordingly. Table 1 shows one example of original meta-review and meta-review generated by our proposed system MRGen.

We examine the following research questions (RQs) to thoroughly analyse our proposed approach and, more precisely, to address the problems discussed in Section 1:

  • [RQ1:] How effective is MRGen to predict the final decision (accept/reject) of a scholarly article?

  • [RQ2:] How is the quality of meta-review generated by MRGen in terms of Rouge score (Relevance)?

  • [RQ3:] How good is the meta-review generated by MRGen in terms of Readability?

  • [RQ4:] What is the overall quality of the meta-review generated by MRGen in terms of Non-redundancy?

  • [RQ5:] What is the cohesion score of meta-review generated by MRGen?

This paper’s main contributions are summarised as follows:
  • 1. To the best of our knowledge, we are the first to explore the meta-review generation challenge in the field of peer review texts for scholarly articles. We employ an agglomerative clustering-based approach to remove redundancy in order to improve the information coverage of the meta-review. To improve the readability of the meta-review, a deep coherence model is applied.

  • 2. To make the generated meta-review coherent, we use coreference to determine which sentences are dependent on each other and group them accordingly. We then use a Pointer Generator Network to perform abstractive summarization on each group of sentences.

  • 3. We propose a novel decision support system, MRGen, which can predict the overall recommendation status of a scholarly article from the peer reviews. The most important terms from individual reviews, which can influence the overall recommendations, are captured by an integrated approach of a convolutional layer, Long short-term memory (LSTM) model, Bi-LSTM model, and attention mechanism.

  • 4. We propose an approach for generating a sentence graph to obtain an order for selecting sentences to generate the draft review using Random Walk With Restart (RWR).

  • 5. Large-scale, systematic studies are performed with the academic data (PeerRead dataset), and the evaluation findings demonstrate the efficacy of our proposed model. Through empirical analysis, we also draw certain interesting conclusions.
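The Random Walk With Restart step named in the contributions can be sketched as follows. This is a generic RWR on a weighted graph, not the paper's exact formulation: the restart probability of 0.15, the uniform restart vector, and the toy similarity matrix are assumptions made here, and the sketch assumes every node has at least one edge.

```python
def random_walk_with_restart(adjacency, restart, restart_prob=0.15, iterations=100):
    """Score graph nodes by iterating r = (1 - c) * W r + c * e.

    adjacency: square matrix of non-negative edge weights (sentence similarities).
    restart: restart distribution e (non-negative, sums to 1).
    restart_prob: assumed restart probability c.
    """
    n = len(adjacency)
    # Column-normalise so transition[i][j] is the probability of moving j -> i.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = list(restart)
    for _ in range(iterations):
        scores = [
            (1 - restart_prob) * sum(transition[i][j] * scores[j] for j in range(n))
            + restart_prob * restart[i]
            for i in range(n)
        ]
    return scores


# Hypothetical similarity graph over four review sentences.
similarity = [
    [0.0, 0.5, 0.4, 0.3],
    [0.5, 0.0, 0.1, 0.0],
    [0.4, 0.1, 0.0, 0.0],
    [0.3, 0.0, 0.0, 0.0],
]
scores = random_walk_with_restart(similarity, restart=[0.25] * 4)
# Sentence indices ordered from highest to lowest score.
order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Sentence 0 is similar to every other sentence, so it ranks first; sentence 3, connected only weakly to sentence 0, ranks last. Such an ordering could then drive which sentences are selected first for a draft review.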

The remainder of our paper is structured as follows. Related work is reviewed in Section 2 to illustrate the paper’s position with respect to the existing state-of-the-art. We elaborate the problem statement in Section 3. In Section 4, we describe the proposed model with its functional architecture. A detailed description of the proposed model is given in Section 5, Section 6, and Section 7. Experimental details, including the data description, are provided in Section 8, followed by results analysis in Section 9. We finally conclude in Section 10.

Section snippets

Related work

In this section, we report the existing work related to Task 1 and Task 2 separately. First, we mention a few state-of-the-art techniques for peer review prediction (acceptance/rejection). Then, in the subsequent section, we report the background work and a few state-of-the-art techniques related to summarization and sentiment analysis.

Problem statement and other definitions

Automatically predicting the final decision of peer reviews for scholarly papers is a tedious task due to the heterogeneous nature of its entities. In this section, we present the problem description and elaborate on the notation and terminology.

Definition 1

Final Decision Prediction (Task 1). Final decision prediction is a classification task where a binary classifier is used to estimate the probability of accept vs. reject of a given paper $P_i$ having peer reviews $R_i = \{r_{i1}, r_{i2}, \ldots, r_{in}\}$, i.e., $P(\text{accept} = \text{True} \mid R_i)$.
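One way to read Definition 1 is as a thresholded probability. The rule below is illustrative only: the threshold $\tau$ and the symbol $\hat{y}_i$ for the predicted decision are assumptions introduced here, not notation taken from the paper.

```latex
\hat{y}_i =
\begin{cases}
\text{accept}, & \text{if } P(\text{accept} = \text{True} \mid R_i) \ge \tau \\
\text{reject}, & \text{otherwise}
\end{cases}
\qquad \text{with, e.g., } \tau = 0.5 .
```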

Methodology

We present a systematic framework of the proposed model alongside its operational strategies. We present a layered architecture where each layer realizes a specialized task. The system contains three major blocks: Draft Review Generation (BLOCK 1), Peer Review Prediction (BLOCK 2), and Meta-Review Generation (BLOCK 3). BLOCK 1 consists of three essential layers, BLOCK 2 consists of three layers, and BLOCK 3 has one essential layer. The functional architecture of our

Draft review generation (BLOCK 1)

To address the task of meta-review generation, we aim at developing a summarizer in the multi-document (multi-review) setting. Initially, we propose a deep coherence model based extractive summarizer for peer review of scholarly articles to generate draft review in order to retain only contextually relevant and non-redundant information. Fig. 3 shows the architecture of BLOCK 1.
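Retaining non-redundant information can be illustrated with a small agglomerative clustering sketch over review sentences. The paper does not specify its similarity measure or linkage in this excerpt, so the Jaccard token overlap, single linkage, the 0.5 merge threshold, and the function names below are all assumptions chosen for brevity.

```python
def jaccard(a, b):
    """Jaccard overlap between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sentences(sentences, threshold=0.5):
    """Greedy agglomerative clustering (single linkage, assumed threshold)."""
    tokens = [set(s.lower().split()) for s in sentences]
    clusters = [[i] for i in range(len(sentences))]
    while len(clusters) > 1:
        best, pair = 0.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                # Single linkage: similarity of the closest pair of members.
                sim = max(
                    jaccard(tokens[i], tokens[j])
                    for i in clusters[x]
                    for j in clusters[y]
                )
                if sim > best:
                    best, pair = sim, (x, y)
        # Stop when no remaining pair of clusters is similar enough.
        if pair is None or best < threshold:
            break
        x, y = pair
        clusters[x] += clusters.pop(y)
    return clusters


review_sentences = [
    "the paper is well written and clear",
    "the paper is clear and well written",
    "experiments lack strong baselines",
]
clusters = cluster_sentences(review_sentences)
# Keep one representative sentence per cluster to drop near-duplicates.
representatives = [review_sentences[c[0]] for c in clusters]
```

The first two sentences share the same tokens and fall into one cluster, so only one of them survives as a representative, while the unrelated third sentence is kept.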

Peer review prediction (BLOCK 2)

To address the task of predicting the final outcome of a scholarly article, we propose a novel decision support system incorporating a convolutional layer, a one-dimensional LSTM, a Bi-LSTM, and an attention mechanism. We provide a detailed description of the functional architecture of the proposed BLOCK 2. This block contains three essential layers, namely the embedding layer, the deep learning-based feature extraction layer, and the dense layer.
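To make the role of the convolutional layer concrete, the sketch below slides a single filter over adjacent word embeddings and max-pools the result. It is a toy illustration under stated assumptions: the embeddings, the filter weights, the ReLU activation, and the name `conv1d_max_pool` are hypothetical, and a real model would learn many filters and pass the feature sequence on to the recurrent layers.

```python
def conv1d_max_pool(embeddings, kernel, bias=0.0):
    """Apply one 1-D convolution filter over adjacent word vectors.

    embeddings: list of word vectors (one per word, equal length).
    kernel: list of weight vectors, one per position in the window.
    Assumes the review has at least len(kernel) words.
    """
    width = len(kernel)
    features = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        # Dot product between the filter and the window of adjacent words.
        activation = bias + sum(
            w * x
            for k_row, vec in zip(kernel, window)
            for w, x in zip(k_row, vec)
        )
        features.append(max(0.0, activation))  # ReLU (assumed activation)
    # Max-over-time pooling keeps the strongest local feature.
    return features, max(features)


# Four words with 2-dimensional embeddings and a filter spanning two words.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]
features, pooled = conv1d_max_pool(embeddings, kernel)
```

Each entry of `features` summarises one pair of adjacent words, which is the sense in which a convolution over neighbouring words captures local features of a review.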

Meta-review generation (BLOCK 3)

To generate the final meta-review we use the draft review generated in BLOCK 1. We also make use of the output of BLOCK 2 to determine the ratio of positive and negative sentences in the meta-review and thus determine what should be the overall sentiment of the generated meta-review. This block also involves abstractive summarization. The draft review contains a list of pros and cons sorted on the basis of the score obtained using RWR in BLOCK 1. To determine the number of positive and negative

Experiments

First, we outline the experimental dataset and experimental setup that are used for the assessment of the proposed system MRGen, including individual evaluations of Task 1 and Task 2, respectively. Then the parameter tuning, evaluation metrics, and baseline methods are explained in further sub-sections. All experiments were conducted on a 64-bit system with a 2.4 GHz Intel Core i5 processor and 8 GB of memory. All the programs are implemented in Python. We implement our model based on Keras and use a TITAN XP

Results and discussion

The performance of MRGen against the current state-of-the-art methods is reported in this section. For clarity, we present the findings and discussion in two stages (Results of Task 1 and Results of Task 2), as given below. During the assessment of MRGen, the best and second-best performers are marked by the ‘*’ and ‘+’ symbols, respectively.

Conclusion and future work

Peer reviews form an essential part of scientific communications. Due to a large number of submissions in the journal and/or conferences, it’s quite cumbersome to manage peer review. In recent times, the recommender system on a peer review platform is an emerging area of research in the field of recommendation. In this paper, we propose a system able to provide a solution to two problems: peer review prediction and meta-review generation. We present MRGen, a novel decision support system, which

Compliance with ethical standards

“The authors declare no conflicts of interest. In this paper, we present a system which provides a solution to two problems: peer review prediction and meta-review generation. The article does not contain any studies with human or animal subjects.”

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


References (59)

  • K. Cho, Machine classification of peer comments in physics, Educ. Data Min. (2008)
  • L. Ramachandran, E.F. Gehringer, Automated assessment of review quality using latent semantic analysis, in: 2011 IEEE...
  • W. Xiong, D. Litman, Automatically predicting peer-review helpfulness, in: Proceedings of the 49th Annual Meeting of...
  • F. Qiao, L. Xu, X. Han, Modularized and attention-based recurrent convolutional neural network for automatic academic...
  • T. Ghosal, R. Verma, A. Ekbal, S. Saha, P. Bhattacharyya, Investigating impact features in editorial pre-screening of...
  • K. Ježek, J. Steinberger, Automatic text summarization (the state of the art 2007 and new challenges), in: Proceedings...
  • D. Kang, W. Ammar, B. Dalvi, M. van Zuylen, S. Kohlmeier, E. Hovy, R. Schwartz, A dataset of peer reviews (peerread):...
  • M.J. Mrowinski et al., Artificial intelligence in peer review: How can evolutionary computation support journal editors?, PloS One (2017)
  • K. Wang, X. Wan, Sentiment analysis of peer review texts for scholarly papers, in: The 41st International ACM SIGIR...
  • C. Dos Santos et al., Deep convolutional neural networks for sentiment analysis of short texts
  • T. Ghosal et al., Deepsentipeer: harnessing sentiment in review texts to recommend peer review decisions
  • T. Ghosal, D. Dey, A. Dutta, A. Ekbal, S. Saha, P. Bhattacharyya, A multiview clustering approach to identify...
  • J. Goldstein, V. Mittal, J. Carbonell, M. Kantrowitz, Multi-document summarization by sentence extraction, in:...
  • R. Mihalcea et al., Bringing order into text
  • G. Erkan et al., Lexrank: graph-based lexical centrality as salience in text summarization, J. Artif. Intell. Res. (2004)
  • M.T. Nayeem, Y. Chali, Extract with order for coherent multi-document summarization, arXiv preprint arXiv:1706.06542...
  • M. Kågebäck et al., Extractive summarization using continuous vector space models
  • R. Nallapati, F. Zhai, B. Zhou, Summarunner: a recurrent neural network based sequence model for extractive...
  • S. Banerjee et al., Multi-document abstractive summarization using ILP based multi-sentence compression, CoRR (2016)

Tribikram Pradhan received his M. Tech in Software Technology from VIT University, Vellore, Tamil Nadu, India in the year 2013. He is currently pursuing a PhD in the Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, India. Prior to this, he worked as an assistant professor in the Department of Information and Communication Technology (ICT), Manipal Institute of Technology, Manipal, India from 2013 to 2016. His primary research interests include the areas of Recommender System, Social Network Analysis, NLP, and Machine Learning.

Chaitanya Bhatia is currently pursuing his Integrated Dual Degree (Bachelors+Master) in the Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, India. His research interests include Deep Learning, Parallel Computing, Natural Language Processing, and Machine Learning.

Prashant Kumar is currently working as a software developer at Amazon Development Centre (India) Private Limited, Chennai, Tamil Nadu. He received his B.Tech degree from the Department of Information and Communication Technology, Manipal Institute of Technology, Manipal, India. His research interests include Deep Learning, Machine Learning, Algorithms, Parallel Computing, and Natural Language Processing.

Sukomal Pal received his M. Tech and Ph.D. from the Department of Computer Science and Engineering, Indian Statistical Institute, Kolkata in the years 2005 and 2012, respectively. He joined as an assistant professor in the Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, India in the year 2016. Prior to this, he worked as an assistant professor in the Department of Computer Science and Engineering, Indian Institute of Technology (ISM), Dhanbad from 2010 to 2015. He authored a book entitled “Sub-document level Information Retrieval: Retrieval and Evaluation” (LAP LAMBERT Academic Publishing). His research interests include the areas of Information Retrieval, Recommender Systems, Text Mining, and Data Science.
