Authors:
Felipe Melo Soares; Ticiana L. Coelho da Silva and Jose F. de Macêdo
Affiliation:
Insight Data Science Lab, Fortaleza, Brazil
Keyword(s):
Sentence Compression, Text Summarization, Natural Language Processing.
Abstract:
Most of the information available on the Web remains unstructured, i.e., text documents such as articles, news, blog posts, product reviews, and forum discussions, among others. Given the huge amount of textual content continuously produced on the Web, it is challenging for users to read and consume every document. Text summarization refers to the technique of shortening long pieces of text. The intention is to create a coherent and fluent summary containing only the main points outlined in the document. Sentence compression can improve text summarization by removing redundant information while preserving the grammaticality and the important content of the original sentences. In this paper, we propose a sentence compression neural network model that achieved promising results compared to other neural network-based models, even when trained with smaller amounts of data. Rather than training the model only with the words from the training set, the proposed model was trained with different features extracted from the texts. This improves the ability of the model to decide whether or not to retain each word in the compressed sentence.
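The per-word retain-or-delete decision described in the abstract can be illustrated with a minimal sketch. It assumes that compression amounts to applying one binary label per token; the function name `compress`, the example sentence, and the hand-written labels are hypothetical stand-ins for a model's predictions, and neither the network nor the features it uses are reproduced here.

```python
from typing import List


def compress(tokens: List[str], keep: List[int]) -> str:
    """Build a compressed sentence from per-token labels (1 = retain, 0 = delete).

    Hypothetical helper: the labels would normally come from a trained
    model; here they are supplied by hand for illustration only.
    """
    if len(tokens) != len(keep):
        raise ValueError("tokens and keep must have the same length")
    # Retained tokens keep their original order, so the output is always
    # a subsequence of the input sentence.
    return " ".join(tok for tok, label in zip(tokens, keep) if label == 1)


# Illustrative input: a tokenized sentence and hand-written labels.
tokens = ["The", "company", ",", "founded", "in", "1990", ",",
          "reported", "record", "profits", "."]
keep = [1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1]

print(compress(tokens, keep))
```

Framing the task this way makes the output a subsequence of the input, which is why the quality of the per-token labels determines both grammaticality and content preservation.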