Experimental analysis of multiple criteria for extractive multi-document text summarization

doi:10.1016/j.eswa.2019.112904

Expert Systems with Applications

Volume 140, February 2020, 112904

https://doi.org/10.1016/j.eswa.2019.112904

Highlights

• We focus on the different criteria used for generating automatic text summaries.
• We contribute with a complete study evaluating and comparing these different criteria.
• This paper analyzes the best criteria and their combinations.
• Our tests use Document Understanding Conferences datasets and the ROUGE metrics.
• Coverage, redundancy reduction, and relevance obtain the most balanced results.

Abstract

Automatic text summarization methods are increasingly needed in different fields of knowledge. In the scientific literature, generic extractive multi-document text summarization can be formulated as an optimization problem that involves several criteria. So far, only two criteria, content coverage and redundancy reduction, have been considered simultaneously, whereas the other two, relevance and coherence, have been considered separately. Therefore, there is a lack of studies comparing the performance of the different criteria. For this reason, a comparative study of the different criteria suitable for generic extractive multi-document text summarization is performed here. All possible combinations of two, three, and four criteria have been considered within a multi-objective optimization context. Experiments have been carried out on datasets from the Document Understanding Conferences (DUC), and the combinations of objective functions have been evaluated and compared with Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. Redundancy reduction has proved to be an indispensable criterion, whereas coherence is the least significant and least efficient one. The combination that includes content coverage, redundancy reduction, and relevance obtains the most balanced results in terms of average ROUGE and execution time.

Introduction

A large amount of information is now available on the World Wide Web, and this volume grows continuously. This explosion of digital information makes it difficult for Internet users to quickly extract the most relevant information on a specific topic. Text mining tools make this process possible (Fan & Bifet, 2013). From all the textual information on a specific topic, text mining tools should be able to automatically generate a summary (Hashimi, Hafez, & Mathkour, 2015). In addition, the generated summary must meet some basic requirements, including the needs of users.

In the scientific literature, there are several types of summarization methods. Text summarization methods can be query-oriented or generic. On the one hand, according to Huang, He, Wei, and Li (2010), query-oriented summarization requires the creation of a summary according to a given query from the user. This query describes the user’s information need, and it usually consists of one or several narrative or interrogative sentences. On the other hand, generic summarization methods do not need any query specified by the user (see e.g. Alguliev, Aliguliyev, & Mehdiyev, 2011c). Zajic, Dorr, and Lin (2008) distinguish between single-document and multi-document summarization methods depending on where the information is obtained from: a single-document summary condenses the information from a single document, whereas a multi-document summary is obtained from a document collection, selecting information from each of its documents.

Text summarization methods can also be classified as abstractive or extractive (Wan, 2008). On the one hand, abstractive summarization methods may generate a summary which contains words or parts of sentences that do not exist in the original text. On the other hand, extractive summarization methods only select entire sentences from the original text.

The text summarization problem can be addressed by single-objective or multi-objective optimization approaches. In the former, only one objective function is optimized (see e.g. Alguliev et al., 2011c). This objective function can include a single criterion, or several criteria combined by weighting them. Most approaches for text summarization found in the scientific literature have used single-objective optimization. Recently, however, multi-objective optimization has been gaining strength for text summarization, as in Sanchez-Gomez, Vega-Rodríguez, and Pérez (2018). This optimization approach allows the optimization of more than one objective function simultaneously. In addition, the results obtained with these approaches improve on those obtained with single-objective optimization approaches.
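The difference between the two approaches can be illustrated with a minimal Python sketch. It is not taken from the paper: the candidate scores, the equal weights, and the function names are hypothetical, and all objectives are assumed to be maximized. The weighted sum collapses the criteria into one score and returns a single solution, whereas Pareto dominance keeps every trade-off that no other candidate improves on in all objectives.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (every objective is maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


def weighted_sum(point: Sequence[float], weights: Sequence[float]) -> float:
    """Single-objective alternative: merge the criteria into one weighted score."""
    return sum(w * x for w, x in zip(weights, point))


# Hypothetical (content coverage, redundancy reduction) scores of four summaries.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.3, 0.8), (0.5, 0.5)]

# Multi-objective view: a set of non-dominated trade-offs.
front = pareto_front(candidates)

# Single-objective view: one winner, which depends on the chosen weights.
best_weighted = max(candidates, key=lambda p: weighted_sum(p, (0.5, 0.5)))
```

In this toy example the front keeps three of the four candidates, while the weighted sum with equal weights selects only the balanced one; a different weight vector would select a different summary.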

The generation of a summary is formulated as an optimization problem, which usually involves a set of criteria to be optimized. In most works, the criteria used have been content coverage and redundancy reduction: the content coverage criterion concerns the presence of the main topic of the document collection in the generated summary, whereas the redundancy reduction criterion tries to avoid similar sentences in the summary (see e.g. Alguliev, Aliguliyev, & Mehdiyev, 2011c; Saleh, Kadhim, & Attea, 2015; Sanchez-Gomez, Vega-Rodríguez, & Pérez, 2018). Other criteria have also been used to a lesser extent, such as relevance and coherence: the relevance criterion tries to include the most important information from the document collection by including the appropriate sentences (Alguliev, Aliguliyev, & Isazade, 2012c), and the coherence criterion assesses the readability of the summary by means of a smooth connectivity among its sentences (Umam, Putro, Pratamasunu, Arifin, & Purwitasari, 2015). These four criteria have been used in the scientific literature, as shown in the related work section. In addition, these criteria have also been used in other approaches that are not focused on optimization techniques, such as graph-based, clustering, or latent semantic analysis approaches. These approaches applied content coverage (see e.g. Angheluta, De Busser, & Moens, 2002; Baralis, Cagliero, Mahoto, & Fiori, 2013) and redundancy reduction (see e.g. Garg, Favre, Reidhammer, & Hakkani-Tür, 2009). As for relevance and coherence, see e.g. Gong and Liu (2001) and Azmi and Al-Thanyyan (2012), respectively. To the best of the authors’ knowledge, no study has been conducted to analyse the performance of all the possible combinations of these four criteria/objectives and assess their importance.
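To make the two most common criteria concrete, the following Python sketch shows one plausible way to score a candidate summary. The formulas are assumptions chosen for illustration and are not the objective functions defined in the paper: content coverage is taken as the average cosine similarity between each summary sentence and the word counts of the whole collection, and redundancy reduction as one minus the average pairwise similarity among the summary sentences. Sentences are represented by raw word counts, and all names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from typing import List


def bag_of_words(sentence: str) -> Counter:
    """Represent a sentence by its raw word counts (assumed representation)."""
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def content_coverage(summary: List[str], collection: List[str]) -> float:
    """Average similarity of the summary sentences to the collection as a whole.

    Assumes a non-empty summary.
    """
    centre = Counter()
    for sentence in collection:
        centre.update(bag_of_words(sentence))
    return sum(cosine(bag_of_words(s), centre) for s in summary) / len(summary)


def redundancy_reduction(summary: List[str]) -> float:
    """One minus the average pairwise similarity among the summary sentences."""
    pairs = list(combinations(summary, 2))
    if not pairs:
        return 1.0  # a single sentence cannot repeat itself
    mean_sim = sum(cosine(bag_of_words(a), bag_of_words(b))
                   for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim


# Hypothetical three-sentence document collection.
collection = [
    "the flood damaged many homes",
    "rescue teams reached the flooded homes",
    "the flood forced families to leave",
]
diverse = [collection[0], collection[1]]
repetitive = [collection[0], collection[0]]
```

Under these assumed definitions, `redundancy_reduction(diverse)` is higher than `redundancy_reduction(repetitive)`, and a sentence that shares no words with the collection has zero coverage.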

In this work, the generic extractive multi-document text summarization problem is addressed with a multi-objective optimization approach. Experiments have been carried out on datasets from the Document Understanding Conferences (DUC) (NIST, 2014). The combinations of objective functions have been evaluated and compared by using Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics (Lin, 2004). The main contributions of this work are the following:

• All criteria suitable for the generic extractive multi-document text summarization problem have been analyzed, implemented, and compared.
• The experiments have taken into account all possible combinations of the different criteria (a total of eleven).
• The multi-objective optimization approach used in the experiments has been adapted to consider different numbers of criteria/objectives (two, three, and four) and all possible combinations of them.
• A complete statistical analysis has been performed for the results obtained in the combinations of two, three, and four objective functions.
• The most relevant criteria have been obtained, and also the best combinations among them.
• The balance between quality improvement and computational cost has been studied for the best combinations.
• A comparison with other works has been included, showing the improvements of the best combination found.
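The total of eleven combinations mentioned in the list above follows from choosing two, three, or four of the four criteria: 6 + 4 + 1 = 11. A short Python sketch enumerates them; the variable names are illustrative only.

```python
from itertools import combinations

# The four criteria named in the text.
CRITERIA = ("content coverage", "redundancy reduction", "relevance", "coherence")

# All subsets of size two, three, and four.
combos_by_size = {k: list(combinations(CRITERIA, k)) for k in (2, 3, 4)}
total = sum(len(combos) for combos in combos_by_size.values())

for size, combos in combos_by_size.items():
    print(f"{size} criteria: {len(combos)} combinations")
print(f"total: {total}")
```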

The remainder of this paper is organized as follows. Section 2 presents the related work. In Section 3, the generic extractive multi-document text summarization is formulated as a multi-objective optimization problem, and the different criteria are described. Section 4 presents the datasets used for the experiments, the evaluation metrics used, the statistical methods carried out, and the algorithm used in the experiments. In Section 5, the results obtained for all combinations of objective functions and their statistical analysis are detailed. Finally, Section 6 includes the conclusions.

Section snippets

Related work

This section presents the state of the art regarding the criteria applied to generic extractive multi-document text summarization from an optimization viewpoint.

First, the case of two criteria is reviewed. All the references found deal with content coverage and redundancy reduction. Alguliev et al. (2011c) proposed an adaptive Differential Evolution (DE) algorithm for the optimization model. Alguliev, Aliguliyev, Hajirahimova, and Mehdiyev (2011) solved the text summarization model with

Problem definition

Generic extractive multi-document text summarization is formalized as an optimization problem. In the field of text summarization, vector-based word methods are the most used, where each sentence is represented as a vector of words. To measure similarity between sentences by means of pairwise comparisons, the most used measure in the scientific literature is the cosine similarity.
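As a minimal illustration of this representation, the Python sketch below builds a word vector for each sentence over a shared vocabulary and compares sentences with the cosine similarity. It assumes raw term counts as vector components and whitespace tokenization; the weighting scheme and preprocessing actually used in the paper may differ, and the example sentences are invented.

```python
import math
from typing import List


def build_vocabulary(sentences: List[str]) -> List[str]:
    """Sorted list of the distinct words in all sentences."""
    return sorted({word for s in sentences for word in s.lower().split()})


def to_vector(sentence: str, vocabulary: List[str]) -> List[float]:
    """Word vector of a sentence: one raw term count per vocabulary entry."""
    words = sentence.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


sentences = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stocks fell sharply today",
]
vocabulary = build_vocabulary(sentences)
vectors = [to_vector(s, vocabulary) for s in sentences]

sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
```

Here the first two sentences share four word occurrences and obtain a similarity of about 0.75, whereas the third sentence shares no words with the first and obtains 0.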

In this section, the cosine similarity measure is explained first. After that, the different criteria used for the

Methodology

This section presents the datasets involved in the experiments, the evaluation metrics used, the statistical methods performed, and the algorithm used for the experiments.
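The evaluation metrics referred to here are the ROUGE metrics (Lin, 2004), which compare a generated summary with human-written reference summaries. The Python sketch below shows a simplified ROUGE-N recall for a single reference, counting the fraction of reference n-grams that also appear in the candidate. It is an illustrative approximation with whitespace tokenization and hypothetical function names, and it omits options of the official ROUGE toolkit such as stemming, stop-word removal, and multiple references.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """Fraction of reference n-grams matched by the candidate (clipped counts)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "the cat sat on the mat"
candidate = "the cat on the mat"

rouge_1 = rouge_n_recall(candidate, reference, n=1)  # 5 of 6 unigrams matched
rouge_2 = rouge_n_recall(candidate, reference, n=2)  # 3 of 5 bigrams matched
```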

Experimental results

This section presents the results separated by number of objective functions and a discussion on the best combinations of objective functions.

Conclusions

Generic extractive multi-document text summarization is a problem whose nature is multi-objective. However, this problem has been traditionally solved from a single-objective perspective. When multi-objective optimization has been considered to address this problem, only two criteria have been used, i.e., content coverage and redundancy reduction. In other cases, criteria such as relevance and coherence have been used independently. Evaluating and comparing the different combinations of the

CRediT authorship contribution statement

Jesus M. Sanchez-Gomez: Software, Validation, Formal analysis, Investigation, Data curation, Writing - original draft, Visualization. Miguel A. Vega-Rodríguez: Conceptualization, Methodology, Formal analysis, Resources, Writing - review & editing, Supervision, Project administration, Funding acquisition. Carlos J. Pérez: Conceptualization, Methodology, Formal analysis, Resources, Writing - review & editing, Supervision, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We thank the referees for comments and suggestions which have improved both the content and the readability of this paper. This research has been supported by Ministerio de Economía y Competitividad (Centro para el Desarrollo Tecnológico Industrial, contract IDI-20161039; Agencia Estatal de Investigación, projects TIN2016-76259-P and MTM2017-86875-C3-2-R), Junta de Extremadura (Contract AA-16-0017-1, and projects GR18090 and GR18108), Cátedra/Aula ASPgems, and European Union (European Regional

References (35)

R.M. Alguliev et al. GenDocSum+MCLR: Generic document summarization based on maximum coverage and less redundancy. Expert Systems with Applications (2012)
R.M. Alguliev et al. MCMR: Maximum coverage and minimum redundant text summarization model. Expert Systems with Applications (2011)
R.M. Alguliev et al. DESAMC+DocSum: Differential evolution with self-adaptive mutation and crossover parameters for multi-document summarization. Knowledge-Based Systems (2012)
R.M. Alguliev et al. CDDS: Constraint-driven document summarization models. Expert Systems with Applications (2013)
R.M. Alguliev et al. Formulation of document summarization as a 0–1 nonlinear programming problem. Computers & Industrial Engineering (2013)
R.M. Alguliev et al. Multiple documents summarization based on evolutionary optimization algorithm. Expert Systems with Applications (2013)
R.M. Alguliev et al. Sentence selection for generic document summarization using an adaptive differential evolution algorithm. Swarm and Evolutionary Computation (2011)
A.M. Azmi et al. A text summarizer for Arabic. Computer Speech & Language (2012)
E. Baralis et al. GRAPHSUM: Discovering correlations among multiple terms for graph-based summarization. Information Sciences (2013)
D. Bollegala et al. A preference learning approach to sentence ordering for multi-document summarization. Information Sciences (2012)
H. Hashimi et al. Selection criteria for text mining approaches. Computers in Human Behavior (2015)
G. Salton et al. Term-weighting approaches in automatic text retrieval. Information Processing & Management (1988)
J.M. Sanchez-Gomez et al. Extractive multi-document text summarization using a multi-objective artificial bee colony optimization approach. Knowledge-Based Systems (2018)
P. Verma et al. MCRMR: Maximum coverage and relevancy with minimal redundancy based multi-document summarization. Expert Systems with Applications (2019)
D.M. Zajic et al. Single-document and multi-document summarization techniques for email threads using sentence compression. Information Processing & Management (2008)
R.M. Alguliev et al. Quadratic boolean programming model and binary differential evolution algorithm for text summarization. Problems of Information Technology (2012)
R.M. Alguliev et al. An unsupervised approach to generating generic summaries of documents. Applied Soft Computing (2015)

