Generating summary sentences using Adversarially Regularized Autoencoders with conditional context
Introduction
Text summarization is a technique for generating a compressed version of an original document. Manual text summarization is expensive, so there has been an increasing demand for automatic summarization technology.
When making business decisions, companies typically summarize Voice of Customer (VOC) data (e.g., review data) so that customer feedback can inform product improvements. Companies usually categorize customer response documents by topic and extract the key content of each category by summarization (Fig. 1). In most cases this task is done manually by experts, making it expensive in both time and money. Furthermore, as the amount of VOC data grows rapidly, automatic summarization techniques are becoming more important than ever. In response to this trend, we proposed an efficient, unsupervised method that automatically summarizes the reviews for each topic.
Text summarization methods fall into two categories: extractive and abstractive. Extractive summarization selects the important parts of a document and uses them verbatim as whole sentences. Abstractive summarization compresses the important content of the document and generates new sentences based on that content. Because abstractive summarization is difficult, most text summarization has relied on extractive methods, including keyword extraction, similarity detection, and rule-based methods. However, recent work building on the sequence-to-sequence model (Sutskever, Vinyals, & Le, 2014) has shown that abstractive summarization is possible by reading text and freely generating new text with recurrent neural networks (RNNs) (Chopra, Auli, & Rush, 2016; Nallapati, Zhou, dos Santos, Gulcehre, & Xiang, 2016; Rush, Chopra, & Weston, 2015; See, Liu, & Manning, 2017; Zeng, Luo, Fidler, & Urtasun). Most of these studies require target summary sentences for training. In contrast, the method proposed in this study is an unsupervised abstractive summarization method that does not require target sentences for training.
Our work addresses multi-document summarization and aims to summarize diverse review data. We summarized review data clustered by topic, conducting experiments on a Korean beer review dataset and on the English Opinosis dataset (Ganesan, Zhai, & Han, 2010).
The abstractive summarization method proposed in this study generated summary sentences using the code vectors of an autoencoder. Several studies have used autoencoders to extract summaries (Yousefi-Azar & Hamey, 2017; Zhong, Liu, Li, & Long, 2015), but our method differs from those in that it uses an autoencoder for abstractive summarization.
We used the code vectors of the Adversarially Regularized Autoencoder (ARAE) model (Zhao, Kim, Zhang, Rush, & LeCun, 2017) to summarize review data categorized by topic. ARAE is a generative model for text data: an autoencoder that represents discretely structured text in a continuous space by jointly training a Generative Adversarial Network (GAN) (Goodfellow et al., 2014). We leveraged this property to generate abstractive summaries.
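To make the data flow concrete, the following is a hypothetical toy sketch of the three ARAE components (encoder, code generator, critic) and a WGAN-style critic objective over the code space. The random embeddings and linear critic stand in for the learned RNN encoder and networks of the actual model; only the structure is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "encoder": map a token-id sequence to a continuous code vector.
# The real ARAE uses a learned RNN encoder; here a fixed random
# embedding average illustrates discrete tokens -> continuous code.
VOCAB, DIM = 50, 8
embed = rng.normal(size=(VOCAB, DIM))

def encode(token_ids):
    return embed[token_ids].mean(axis=0)  # one code vector per sentence

# Toy "critic": scores whether a code looks like a real encoder output.
w = rng.normal(size=DIM)
def critic(code):
    return float(code @ w)

# Toy "generator": maps Gaussian noise to a fake code vector.
g = rng.normal(size=(DIM, DIM))
def generate(noise):
    return np.tanh(noise @ g)

real_code = encode(np.array([3, 17, 42]))
fake_code = generate(rng.normal(size=DIM))

# WGAN-style critic objective: raise scores on real codes, lower on fakes.
# Training alternates this with encoder/generator updates (omitted).
critic_loss = -(critic(real_code) - critic(fake_code))
print(real_code.shape, np.isfinite(critic_loss))
```

In the full model, adversarial training pushes the encoder's code distribution toward the generator's smooth prior, which is what makes the code space well-behaved enough to decode from.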
In addition, we proposed the Conditional Adversarially Regularized Autoencoder (CARAE) model, which incorporates the review data cluster as conditional information. This model was designed to improve summary sentence generation for each cluster.
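One common way to add such conditional information, sketched below under the assumption that CARAE conditions in a comparable manner (the paper's exact wiring may differ), is to append a one-hot topic vector to the code before it reaches the decoder.

```python
import numpy as np

# Hypothetical sketch: condition the decoder on the topic cluster by
# appending a one-hot condition vector to the code vector.
N_TOPICS, CODE_DIM = 4, 8

def condition(code, topic_id):
    one_hot = np.zeros(N_TOPICS)
    one_hot[topic_id] = 1.0
    return np.concatenate([code, one_hot])  # decoder input: code + topic

code = np.ones(CODE_DIM)
cond_code = condition(code, topic_id=2)
print(cond_code.shape)  # (12,)
```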
In summary, this paper's contributions are as follows:
- We proposed a novel and intuitive summary generation method that uses the code vectors of an autoencoder.
- We used the ARAE model to obtain the code vectors, yielding an abstractive summarization method that can be trained without supervision.
- We proposed a new autoencoder model, CARAE, which is specialized for summarizing review data categorized by topic.
- Our proposed models performed better than existing approaches on both English and Korean document summarization.
- Our proposed models can help companies understand customer feedback more easily and clearly.
The remainder of the paper is organized as follows. Section 2 discusses related work. Sections 3 and 4 explain the experimental methods and setup, respectively. Section 5 presents the experimental results. Section 6 discusses the results. Section 7 provides conclusions and ideas for future research.
Related work
Text summarization has been conducted using a variety of approaches (Abdi, Shamsuddin, Hasan, & Piran, 2018; Condori & Pardo, 2017; Ferreira, de Souza Cabral, Lins, e Silva, Freitas, Cavalcanti, Lima, Simske, & Favaro, 2013; Peng, Gao, Zhu, Huang, Yuan, & Li, 2016; Wu, Lei, Li, Huang, Zheng, Chen, & Xu, 2017). Neural network models are one such approach. Kaikhah (2004) successfully proposed automatic text summarization by using shallow neural networks. Svore, Vanderwende, and Burges (2007) proposed a
Models and method
Our study used an autoencoder structure to perform text summarization through an encoded code space that represents each sentence well. Instead of extracting summary sentences by ranking each sentence, summary sentences were generated by using each sentence's code vector to compute a central code. The ARAE model used in this study was designed to learn a more robust continuous-space representation for discrete structured data such as text (Zhao et al., 2017). This model consists of
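The central-code idea can be sketched as a toy pipeline: encode every sentence in a topic cluster, average the code vectors, and decode the centroid. In this illustrative sketch a fixed random projection stands in for the learned encoder, and "decoding" is approximated by the sentence whose code lies nearest to the centroid; the actual model decodes the centroid into newly generated text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Example cluster of reviews on one topic (hypothetical data).
sentences = ["the beer is too bitter",
             "bitter taste but nice aroma",
             "great aroma overall"]

# Toy encoder: average of fixed random word vectors per sentence.
vocab = sorted({w for s in sentences for w in s.split()})
proj = rng.normal(size=(len(vocab), 8))

def encode(sent):
    idx = [vocab.index(w) for w in sent.split()]
    return proj[idx].mean(axis=0)

# Central code: the mean of all sentence code vectors in the cluster.
codes = np.stack([encode(s) for s in sentences])
centroid = codes.mean(axis=0)

# Stand-in for decoding: the sentence closest to the central code.
nearest = min(sentences, key=lambda s: np.linalg.norm(encode(s) - centroid))
print(nearest in sentences)  # True
```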
Experiments
This section describes the experimental input data, the baselines for comparison, how the proposed models were implemented, and how the proposed models were evaluated.
Korean data summarization
Tables 2 and 3 show the results of the Korean text summarization. These results are the average summarization scores for each of the 277 topics. The evaluation was performed after removing stopwords.
Table 2 shows the results of ROUGE-1, ROUGE-2, and ROUGE-L. Overall, ARAE and CARAE performed better than their corresponding baselines and CARAE performed better than ARAE.
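To make the reported metrics concrete, below is a minimal sketch of ROUGE-N recall (overlap of reference n-grams); real evaluations typically use the official ROUGE toolkit, which additionally handles details such as stemming and count clipping.

```python
# Minimal ROUGE-N recall: fraction of reference n-grams found in the
# candidate summary (simplified; duplicate n-grams are not clipped).
def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rouge_n_recall(candidate, reference, n):
    ref = ngrams(reference.split(), n)
    if not ref:
        return 0.0
    cand = ngrams(candidate.split(), n)
    overlap = sum(1 for g in ref if g in cand)
    return overlap / len(ref)

# 4 of the 5 reference unigrams appear in the candidate.
print(rouge_n_recall("the beer is bitter", "the beer is too bitter", 1))  # 0.8
```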
Comparing models that differ only in the presence of condition nodes, ARAE was 0.27 points higher than CARAE for ROUGE-1, but
Discussion
This paper discussed the proposed ARAE and CARAE summarization methods. ARAE encodes text data better through joint training with a GAN, and this model was used to generate summary sentences from average code vectors. AE and CAE, which do not include a GAN, were developed as comparison models for ARAE and CARAE, respectively.
Our work compared the experimental results from two perspectives. The first compares the performance of ARAE and CARAE with that of AE and CAE, respectively, to determine the effect of condition
Conclusion and future work
In this research we tested the effectiveness of the proposed text summarization methods at generating summary sentences for review data grouped by topic. We proposed a novel abstractive summarization method based on decoding average code vectors. We adopted the ARAE model for our summarization method because it can smoothly encode discrete text data. The review data in this study covered various topics, so we proposed the CARAE model, which incorporates conditional topic information. ARAE
Acknowledgements
This research was supported by the project “Development of Automatic Coding Method for Survey” of MACROMILL EMBRAIN.
References (34)
- Machine learning-based multi-documents sentiment-oriented summarization using linguistic treatment. Expert Systems with Applications (2018).
- Assessing sentence scoring techniques for extractive text summarization. Expert Systems with Applications (2013).
- Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks (2005).
- High quality information extraction and query-oriented summarization for automatic query-reply in social network. Expert Systems with Applications (2016).
- A topic modeling based approach to novel document automatic summarization. Expert Systems with Applications (2017).
- Text summarization using unsupervised deep learning. Expert Systems with Applications (2017).
- Query-oriented unsupervised multi-document summarization via deep learning model. Expert Systems with Applications (2015).
- Sentiment diversification for short review summarization. Proceedings of the International Conference on Web Intelligence (WI '17) (2017).
- Wasserstein GAN. STAT (2017).
- Abstractive sentence summarization with attentive recurrent neural networks. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2016).
- Opinion summarization methods: Comparing and extending extractive and abstractive approaches. Expert Systems with Applications.
- LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research.
- Opinosis: A graph-based approach to abstractive summarization of highly redundant opinions. Proceedings of the 23rd International Conference on Computational Linguistics.
- Micropinion generation: An unsupervised approach to generating ultra-concise summaries of opinions. Proceedings of the 21st International Conference on World Wide Web.
- Generative adversarial nets. Advances in Neural Information Processing Systems.