Recent advances in document summarization

Yao, Jin-ge; Wan, Xiaojun; Xiao, Jianguo

doi:10.1007/s10115-017-1042-4

Survey Paper
Published: 28 March 2017

Volume 53, pages 297–336, (2017)
Knowledge and Information Systems

Abstract

The task of automatic document summarization aims at generating short summaries for originally long documents. A good summary should cover the most important information of the original document or a cluster of documents, while being coherent, non-redundant and grammatically readable. Numerous approaches for automatic summarization have been developed to date. In this paper we give a self-contained, broad overview of recent progress made in document summarization within the last 5 years. Specifically, we emphasize significant contributions made in recent years that represent the state of the art of document summarization, including progress on modern sentence extraction approaches that improve concept coverage, information diversity and content coherence, as well as attempts at summarization frameworks that integrate sentence compression, and more abstractive systems that are able to produce completely new sentences. In addition, we review progress made in document summarization in domains, genres and applications that are different from traditional settings. We also point out some of the latest trends and highlight a few possible future directions.

Notes

www.summly.com.
However, readers are still assumed to have some basic knowledge of natural language processing and text mining in general.
The tf-idf weighting scheme is a well-known concept in information retrieval. It combines the term frequency (tf) of each term in the document with a complementary weight, the inverse document frequency (idf), i.e., the inverse of the number of documents that contain the term, which penalizes terms found in many documents in the collection.
There is an equivalent definition which provides less intuition in the context of document summarization: $f$ is submodular iff $f(A)+f(B)\ge f(A\cup B) + f(A\cap B)$ for all $A,B\subseteq V$.
A set function f is called monotone, if $f(A)\le f(B)$ whenever $A\subseteq B$.
The original paper [116] incorrectly proved a better $(1-1/\sqrt{e})$ bound, as pointed out in a later work from a different research group [134].
Available at http://www.cs.cornell.edu/~rs/sfour/.
Starting from [70], all of these papers, oddly, evaluate their systems only on query-focused datasets, although the systems are designed for generic cases.
Nevertheless, in some specific domains and genres such as meeting summarization or opinion summarization, the system has to produce abstractive summaries. We briefly introduce the relevant work in the next section.
That said, designing architectures that actually work is commonly reckoned to be equally labor-intensive.
The authors of [119] use ROUGE-1 recall as the fitness function for measuring summarization quality. The discreteness of the objective function (ROUGE) hampers the use of linear programming solutions. In principle, other more advanced and more efficient global optimization techniques such as Bayesian optimization [173] may also be applicable.
For a more specific, comprehensive discussion on opinion summarization, readers may refer to existing survey papers (e.g., [90, 120]).
A scheme of information structure that classifies sentences in scientific text into categories (such as Aim, Background, Own, Contrast and Basis) based on their rhetorical status in scientific discourse.
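
The set-function definitions given in the notes above (the submodularity inequality and monotonicity) can be checked by brute force on a small example. The following is a minimal sketch, not taken from the paper: the toy sentences, the concept-coverage function `coverage`, and the helper names are all hypothetical and chosen only for illustration.

```python
from itertools import chain, combinations

# Hypothetical toy data: each candidate sentence covers a set of concept ids.
SENTENCES = {
    "s1": {"a", "b"},
    "s2": {"b", "c"},
    "s3": {"c", "d", "e"},
}
V = frozenset(SENTENCES)  # ground set of candidate sentences


def coverage(selected):
    """f(S): number of distinct concepts covered by the selected sentences."""
    covered = set()
    for name in selected:
        covered |= SENTENCES[name]
    return len(covered)


def powerset(items):
    """All subsets of `items`, each as a frozenset."""
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def is_submodular(f, ground):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for all A, B within ground."""
    subsets = powerset(ground)
    return all(
        f(a) + f(b) >= f(a | b) + f(a & b) for a in subsets for b in subsets
    )


def is_monotone(f, ground):
    """Check f(A) <= f(B) whenever A is a subset of B."""
    subsets = powerset(ground)
    return all(f(a) <= f(b) for a in subsets for b in subsets if a <= b)


if __name__ == "__main__":
    print("coverage is submodular:", is_submodular(coverage, V))
    print("coverage is monotone:", is_monotone(coverage, V))
```

The exhaustive check enumerates all pairs of subsets, so it is only practical for very small ground sets; it is meant to make the definitions concrete rather than to serve as a summarization component.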

References

Alfonseca E, Pighin D, Garrido G (2013) Heady: news headline abstraction through event pattern clustering. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1243–1253
Almeida M, Martins A (2013) Fast and robust compressive summarization with dual decomposition and multi-task learning. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 196–206
Ayana, Shen S, Liu Z, Sun M (2016) Neural headline generation with minimum risk training. CoRR abs/1604.01904
Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: International conference on learning representations (ICLR)
Bairi R, Iyer R, Ramakrishnan G, Bilmes J (2015) Summarization of multi-document topic hierarchies using submodular mixtures. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 553–563
Banerjee S, Mitra P, Sugiyama K (2015) Multi-document abstractive summarization using ilp based multi-sentence compression. In: International joint conference on artificial intelligence
Barzilay R, Elhadad M (1999) Using lexical chains for text summarization. Advances in automatic text summarization, pp 111–121
Barzilay R, Elhadad N (2002) Inferring strategies for sentence ordering in multidocument news summarization. J Artif Intell Res 17:35–55
Barzilay R, McKeown K (2005) Sentence fusion for multidocument news summarization. Comput Linguist 31(3):297–328. doi:10.1162/089120105774321091
Baumel T, Cohen R, Elhadad M (2014) Query-chain focused summarization. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Baltimore, pp 913–922
Baumel T, Cohen R, Elhadad M (2016) Topic concentration in query focused summarization datasets. In: AAAI Conference on Artificial Intelligence
Berg-Kirkpatrick T, Gillick D, Klein D (2011) Jointly learning to extract and compress. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Portland, pp 481–490
Bing L, Li P, Liao Y, Lam W, Guo W, Passonneau R (2015) Abstractive multi-document summarization via phrase selection and merging. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 1587–1597
Boudin F, Mougard H, Favre B (2015) Concept-based summarization using integer linear programming: From concept pruning to multiple optimal solutions. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1914–1918
Cao Z, Wei F, Dong L, Li S, Zhou M (2015) Ranking with recursive neural networks and its application to multi-document summarization. In: AAAI conference on artificial intelligence
Cao Z, Wei F, Li S, Li W, Zhou M, Wang H (2015) Learning summary prior representation for extractive summarization. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 2: short papers). Association for Computational Linguistics, Beijing, pp 829–833
Cao Z, Chen C, Li W, Li S, Wei F, Zhou M (2016) Tgsum: build tweet guided multi-document summarization dataset. In: AAAI conference on artificial intelligence
Cao Z, Li W, Li S, Wei F, Li Y (2016) Attsum: Joint learning of focusing and summarization with neural attention. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee. Osaka, pp 547–556
Carbonell JG, Goldstein J (1998) The use of mmr, diversity-based reranking for reordering documents and producing summaries. In: SIGIR ’98: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval, August 24–28, 1998, Melbourne, Australia, pp 335–336. doi:10.1145/290941.291025
Carenini G, Cheung JCK, Pauls A (2013) Multi-document summarization of evaluative text. Comput Intell 29(4):545–576. doi:10.1111/j.1467-8640.2012.00417.x
Celikyilmaz A, Hakkani-Tur D (2010) A hybrid hierarchical model for multi-document summarization. In: Proceedings of the 48th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Uppsala, pp 815–824
Celikyilmaz A, Hakkani-Tur D (2011) Discovery of topically coherent sentences for extractive summarization. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Portland, pp 491–499
Ceylan H, Mihalcea R, Özertem U, Lloret E, Palomar M (2010) Quantifying the limits and success of extractive summarization systems across domains. In: Human language technologies: the 2010 annual conference of the North American chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Los Angeles, pp 903–911
Chakrabarti D, Punera K (2011) Event summarization using tweets. In: International AAAI conference on web and social media
Chali Y, Hasan SA (2012) On the effectiveness of using sentence compression models for query-focused multi-document summarization. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee. Mumbai, pp 457–474
Chan W, Zhou X, Wang W, Chua TS (2012) Community answer summarization for multi-sentence question with group l1 regularization. In: Proceedings of the 50th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Jeju Island, pp 582–591
Cheng G, Xu D, Qu Y (2015) Summarizing entity descriptions for effective and efficient human-centered entity linking. In: Proceedings of the 24th international conference on World Wide Web, WWW 2015, Florence, Italy, May 18–22, 2015, pp 184–194. doi:10.1145/2736277.2741094
Cheng J, Lapata M (2016) Neural summarization by extracting sentences and words. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 484–494
Cheung JCK, Penn G (2013) Towards robust abstractive multi-document summarization: a caseframe analysis of centrality and domain. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1233–1242
Cheung JCK, Penn G (2014) Unsupervised sentence enhancement for automatic summarization. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 775–786
Chopra S, Auli M, Rush AM (2016) Abstractive sentence summarization with attentive recurrent neural networks. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, San Diego, pp 93–98
Christensen J, Mausam, Soderland S, Etzioni O (2013) Towards coherent multi-document summarization. In: Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Atlanta, pp 1163–1173
Christensen J, Soderland S, Bansal G, Mausam (2014) Hierarchical summarization: Scaling up multi-document summarization. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Baltimore, pp 902–912
Clarke J, Lapata M (2008) Global inference for sentence compression: an integer linear programming approach. J Artif Intell Res 31:399–429. doi:10.1613/jair.2433
Cohan A, Goharian N (2015) Scientific article summarization using citation-context and article’s discourse structure. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 390–400
Cohen WW, Schapire RE, Singer Y (1999) Learning to order things. J Artif Intell Res 10:243–270. doi:10.1613/jair.587
Conroy JM, O’Leary DP (2001) Text summarization via hidden markov models. In: SIGIR 2001: proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval, September 9–13, 2001, New Orleans, Louisiana, USA, pp 406–407. doi:10.1145/383952.384042
Contractor D, Guo Y, Korhonen A (2012) Using argumentative zones for extractive summarization of scientific articles. In: Proceedings of COLING 2012, The COLING 2012 Organizing Committee. Mumbai, India, pp 663–678
Das D, Martins AF (2007) A survey on automatic text summarization. Lit Surv Lang Stat II Course CMU 4:192–195
Dasgupta A, Kumar R, Ravi S (2013) Summarization through submodularity and dispersion. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1014–1022
Davis ST, Conroy JM, Schlesinger JD (2012) Occams–an optimal combinatorial covering algorithm for multi-document summarization. In: 2012 IEEE 12th international conference on data mining workshops. IEEE, pp 454–463
Delort JY, Alfonseca E (2012) Dualsum: a topic-model based approach for update summarization. In: Proceedings of the 13th conference of the European chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Avignon, pp 214–223
Di Fabbrizio G, Stent A, Gaizauskas R (2014) A hybrid approach to multi-document summarization of opinions in reviews. In: Proceedings of the 8th international natural language generation conference (INLG). Association for Computational Linguistics, Philadelphia, pp 54–63
Duan Y, Chen Z, Wei F, Zhou M, Shum HY (2012) Twitter topic summarization by ranking tweets using social influence and content quality. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee. Mumbai, pp 763–780
Durrett G, Berg-Kirkpatrick T, Klein D (2016) Learning-based single-document summarization with compression and anaphoricity constraints. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 1998–2008
Elsner M, Santhanam D (2011) Learning to fuse disparate sentences. In: Proceedings of the workshop on monolingual text-to-text generation. Association for Computational Linguistics, Portland, pp 54–63
Erkan G, Radev DR (2004) Lexrank: graph-based lexical centrality as salience in text summarization. J Artif Intell Res 22:457–479
Fang Y, Teufel S (2014) A summariser based on human memory limitations and lexical competition. In: Proceedings of the 14th conference of the European chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Gothenburg, pp 732–741
Fang Y, Teufel S (2016) Improving argument overlap for proposition-based summarisation. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Berlin, pp 479–485
Fang Y, Zhu H, Muszyńska E, Kuhnle A, Teufel S (2016) A proposition-based abstractive summariser. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee. Osaka, pp 567–578
Filippova K (2010) Multi-sentence compression: Finding shortest paths in word graphs. In: Proceedings of the 23rd international conference on computational linguistics (Coling 2010). Coling 2010 Organizing Committee, Beijing, pp 322–330
Fried D, Jansen P, Hahn-Powell G, Surdeanu M, Clark P (2015) Higher-order lexical semantic models for non-factoid answer reranking. Trans Assoc Comput Linguist 3:197–210
Galanis D, Lampouras G, Androutsopoulos I (2012) Extractive multi-document summarization with integer linear programming and support vector regression. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee. Mumbai, pp 911–926
Gambhir M, Gupta V (2016) Recent automatic text summarization techniques: a survey. Artif Intell Rev 47:1–66
Ganesan K, Zhai C, Han J (2010) Opinosis: a graph based approach to abstractive summarization of highly redundant opinions. In: Proceedings of the 23rd international conference on computational linguistics (Coling 2010). Coling 2010 Organizing Committee, Beijing, pp 340–348
Gao D, Li W, Zhang R (2013) Sequential summarization: A new application for timely updated twitter trending topics. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Sofia, pp 567–571
Ge T, Pei W, Ji H, Li S, Chang B, Sui Z (2015) Bring you to the past: Automatic generation of topically relevant event chronicles. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 575–585
Ge T, Cui L, Chang B, Li S, Zhou M, Sui Z (2016) News stream summarization using burst information networks. In: Proceedings of the 2016 conference on empirical methods in natural language processing. Association for Computational Linguistics, Austin, pp 784–794
Genest PE, Lapalme G (2012) Fully abstractive approach to guided summarization. In: Proceedings of the 50th annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Jeju Island, pp 354–358
Gerani S, Mehdad Y, Carenini G, Ng RT, Nejat B (2014) Abstractive summarization of product reviews using discourse structure. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 1602–1613
Gillenwater J, Kulesza A, Taskar B (2012) Discovering diverse and salient threads in document collections. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning. Association for Computational Linguistics, Jeju Island, pp 710–720
Gillick D, Favre B, Hakkani-Tur D (2008) The ICSI summarization system at TAC 2008. In: Proceedings of the text analysis conference
Gillick D, Favre B, Hakkani-Tur D, Bohnet B, Liu Y, Xie S (2009) The ICSI/UTD summarization system at TAC 2009. In: Proceedings of the second text analysis conference. National Institute of Standards and Technology, Gaithersburg
Gorinski PJ, Lapata M (2015) Movie script summarization as graph-based scene extraction. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 1066–1076
Graham Y (2015) Re-evaluating automatic summarization with bleu and 192 shades of rouge. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 128–137
Gu J, Lu Z, Li H, Li VO (2016) Incorporating copying mechanism in sequence-to-sequence learning. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 1631–1640
Gulcehre C, Ahn S, Nallapati R, Zhou B, Bengio Y (2016) Pointing the unknown words. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 140–149
Haghighi A, Vanderwende L (2009) Exploring content models for multi-document summarization. In: Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Boulder, pp 362–370
He L, Li W, Zhuge H (2016) Exploring differential topic models for comparative summarization of scientific papers. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 1028–1038
He Z, Chen C, Bu J, Wang C, Zhang L, Cai D, He X (2012) Document summarization based on data reconstruction. In: AAAI conference on artificial intelligence
Hirao T, Yoshida Y, Nishino M, Yasuda N, Nagata M (2013) Single-document summarization as a tree knapsack problem. In: Proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, pp 1515–1520
Hong K, Nenkova A (2014) Improving the estimation of word importance for news multi-document summarization. In: Proceedings of the 14th conference of the European chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Gothenburg, pp 712–721
Hong K, Conroy J, Favre B, Kulesza A, Lin H, Nenkova A (2014) A repository of state of the art and competitive baseline summaries for generic news summarization. In: Calzolari N, Choukri K, Declerck T, Loftsson H, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S (eds) Proceedings of the ninth international conference on language resources and evaluation (LREC’14). European Language Resources Association (ELRA), Reykjavik, pp 1608–1616. ACL Anthology Identifier: L14-1070
Hong K, Marcus M, Nenkova A (2015) System combination for multi-document summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 107–117
Hovy E, Lin CY, Zhou L, Fukumoto J (2006) Automated summarization evaluation with basic elements. In: Proceedings of the fifth conference on language resources and evaluation (LREC 2006), Citeseer, pp 604–611
Hu B, Chen Q, Zhu F (2015) Lcsts: A large scale chinese short text summarization dataset. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1967–1972
Hu P, Ji D, Teng C, Guo Y (2012) Context-enhanced personalized social summarization. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee, Mumbai, pp 1223–1238
Hu Y, Wan X (2015) Ppsgen: Learning-based presentation slides generation for academic papers. IEEE Trans Knowl Data Eng 27(4):1085–1097. doi:10.1109/TKDE.2014.2359652
Huang X, Wan X, Xiao J (2011) Comparative news summarization using linear programming. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Portland, pp 648–653
Iyer S, Konstas I, Cheung A, Zettlemoyer L (2016) Summarizing source code using a neural attention model. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 2073–2083
Jayanth J, Sundararaj J, Bhattacharyya P (2015) Monotone submodularity in opinion summaries. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 169–178
Jha R, Finegan-Dollak C, King B, Coke R, Radev D (2015) Content models for survey generation: a factoid-based evaluation. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 441–450
Ji H, Favre B, Lin WP, Gillick D, Hakkani-Tur D, Grishman R (2013) Open-domain multi-document summarization via information extraction: challenges and prospects. In: Poibeau T, Saggion H, Piskorski J, Yangarber R (eds) Multi-source, multilingual information extraction and summarization. Springer, Berlin, pp 177–201
Ji Y, Haffari G, Eisenstein J (2016) A latent variable recurrent neural network for discourse-driven language models. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, San Diego, pp 332–342
Judd J, Kalita J (2013) Better twitter summaries? In: Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Atlanta, pp 445–449
Kedzie C, McKeown K, Diaz F (2015) Predicting salient updates for disaster summarization. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 1608–1617
Kedzie C, Diaz F, McKeown K (2016) Real-time web scale event summarization using sequential decision making. In: International joint conference on artificial intelligence, pp 3754–3760
Kikuchi Y, Hirao T, Takamura H, Okumura M, Nagata M (2014) Single document summarization based on nested tree structure. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Baltimore, pp 315–320
Kikuchi Y, Neubig G, Sasano R, Takamura H, Okumura M (2016) Controlling output length in neural encoder-decoders. In: Proceedings of the 2016 conference on empirical methods in natural language processing. Association for Computational Linguistics, Austin, pp 1328–1338
Kim HD, Ganesan K, Sondhi P, Zhai CX (2011) Comprehensive review of opinion summarization. UIUC Technical Report, USA
Kobayashi H, Noguchi M, Yatsuka T (2015) Summarization based on embedding distributions. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1984–1989
Kågebäck M, Mogren O, Tahmasebi N, Dubhashi D (2014) Extractive summarization using continuous vector space models. In: Proceedings of the 2nd workshop on continuous vector space models and their compositionality (CVSC). Association for Computational Linguistics, Gothenburg, pp 31–39
Kulesza A, Taskar B (2011) Learning determinantal point processes. In: Proceedings of the 27th conference on uncertainty in artificial intelligence
Kulesza A, Taskar B (2012) Determinantal point processes for machine learning. Found Trends Mach Learn 5(2–3):123–286
Lei T, Barzilay R, Jaakkola T (2016) Rationalizing neural predictions. In: Proceedings of the 2016 conference on empirical methods in natural language processing. Association for Computational Linguistics, Austin, pp 107–117
Li C, Liu F, Weng F, Liu Y (2013) Document summarization via guided sentence compression. In: Proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, pp 490–500
Li C, Qian X, Liu Y (2013) Using supervised bigram-based ilp for extractive summarization. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1004–1013
Li C, Liu Y, Liu F, Zhao L, Weng F (2014) Improving multi-documents summarization by sentence compression based on expanded constituent parse trees. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 691–701
Li C, Liu Y, Zhao L (2015) Improving update summarization via supervised ilp and sentence reranking. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 1317–1322
Li C, Liu Y, Zhao L (2015) Using external resources and joint learning for bigram weighting in ilp-based multi-document summarization. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 778–787
Li C, Wei Z, Liu Y, Jin Y, Huang F (2016) Using relevant public posts to enhance news article summarization. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee. Osaka, pp 557–566
Li J, Cardie C (2014) Timeline generation: tracking individuals on twitter. In: 23rd international world wide web conference, WWW ’14, Seoul, Republic of Korea, April 7–11, 2014, pp 643–652. doi:10.1145/2566486.2567969
Li J, Li S (2013) Evolutionary hierarchical dirichlet process for timeline summarization. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Sofia, pp 556–560
Li J, Li S (2013) A novel feature-based bayesian model for query focused multi-document summarization. Trans Assoc Comput Linguist 1:89–98
Li J, Li S, Wang X, Tian Y, Chang B (2012) Update summarization using a multi-level hierarchical dirichlet process model. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee. Mumbai, pp 1603–1618
Li J, Gao W, Wei Z, Peng B, Wong KF (2015) Using content-level structures for summarizing microblog repost trees. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 2168–2178
Li J, Luong T, Jurafsky D (2015) A hierarchical neural autoencoder for paragraphs and documents. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 1106–1115
Li JJ, Thadani K, Stent A (2016) The role of discourse units in near-extractive summarization. In: Proceedings of the 17th annual meeting of the special interest group on discourse and dialogue. Association for Computational Linguistics, Los Angeles, pp 137–147
Li L, Zhou K, Xue G, Zha H, Yu Y (2009) Enhancing diversity, coverage and balance for summarization through structure learning. In: Proceedings of the 18th international conference on world wide web, WWW 2009, Madrid, Spain, April 20–24, 2009, pp 71–80. doi:10.1145/1526709.1526720
Li P, Bing L, Lam W, Li H, Liao Y (2015) Reader-aware multi-document summarization via sparse coding. In: International joint conference on artificial intelligence
Li Y, Li S (2014) Query-focused multi-document summarization: Combining a topic model with graph-based semi-supervised learning. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers. Dublin City University and Association for Computational Linguistics, Dublin, pp 1197–1207
Liakata M, Dobnik S, Saha S, Batchelor C, Rebholz-Schuhmann D (2013) A discourse-driven content model for summarising scientific articles evaluated in a complex question answering task. In: Proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, pp 747–757
Lin CY (2003) Improving summarization performance by sentence compression—a pilot study. In: Proceedings of the sixth international workshop on information retrieval with Asian languages. Association for Computational Linguistics, Sapporo, pp 1–8
Lin CY, Hovy E (2000) The automated acquisition of topic signatures for text summarization. In: Proceedings of the 18th conference on computational linguistics—volume 1. Association for Computational Linguistics, pp 495–501
Lin CY, Hovy E (2003) Automatic evaluation of summaries using n-gram co-occurrence statistics. In: Proceedings of the 2003 conference of the North American chapter of the Association for Computational Linguistics on human language technology—volume 1. Association for Computational Linguistics, pp 71–78
Lin H, Bilmes J (2010) Multi-document summarization via budgeted maximization of submodular functions. In: Human language technologies: the 2010 annual conference of the North American chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Los Angeles, pp 912–920
Lin H, Bilmes J (2011) A class of submodular functions for document summarization. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Portland, pp 510–520
Lin H, Bilmes JA (2012) Learning mixtures of submodular shells with application to document summarization. In: Proceedings of the 28th conference on uncertainty in artificial intelligence
Litvak M, Last M (2013) Cross-lingual training of summarization systems using annotated corpora in a foreign language. Inf Retr 16(5):629–656. doi:10.1007/s10791-012-9210-3
Liu B (2012) Sentiment analysis and opinion mining. Synth Lect Hum Lang Technol 5(1):1–167
Liu F, Flanigan J, Thomson S, Sadeh N, Smith NA (2015) Toward abstractive summarization using semantic representations. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 1077–1086
Liu H, Yu H, Deng ZH (2015) Multi-document summarization based on two-level sparse representation model. In: AAAI conference on artificial intelligence
Liu X, Li Y, Wei F, Zhou M (2012) Graph-based multi-tweet summarization using social signals. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee, Mumbai, pp 1699–1714
Liu Y, Zhong SH, Li W (2012) Query-oriented multi-document summarization via unsupervised deep learning. In: AAAI conference on artificial intelligence, pp 1699–1705
Lloret E, Palomar M (2013) Towards automatic tweet generation: a comparative study from the text summarization perspective in the journalism genre. Expert Syst Appl 40(16):6624–6630. doi:10.1016/j.eswa.2013.06.021
Louis A, Nenkova A (2013) Automatically assessing machine summary content without a gold standard. Comput Linguist 39(2):267–300
Loza V, Lahiri S, Mihalcea R, Lai PH (2014) Building a dataset for summarization and keyword extraction from emails. In: Proceedings of the ninth international conference on language resources and evaluation (LREC’14). European Language Resources Association (ELRA), Reykjavik, Iceland, pp 2441–2446. ACL Anthology Identifier: L14-1028
Luo W, Litman D (2015) Summarizing student responses to reflection prompts. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1955–1960
Ma S, Deng ZH, Yang Y (2016) An unsupervised multi-document summarization framework based on neural document model. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 1514–1523
Mann WC, Thompson SA (1988) Rhetorical structure theory: toward a functional theory of text organization. Text Interdiscip J Study Discourse 8(3):243–281
McDonald RT (2007) A study of global inference algorithms in multi-document summarization. In: Advances in information retrieval, 29th European conference on IR research, ECIR 2007, Rome, Italy, April 2–5, 2007, proceedings, pp 557–564
Metzler D, Kanungo T (2008) Machine learned sentence selection strategies for query-biased summarization. In: SIGIR learning to rank workshop, pp 40–47
Mihalcea R, Tarau P (2004) Textrank: bringing order into texts. In: Lin D, Wu D (eds) Proceedings of EMNLP 2004. Association for Computational Linguistics, Barcelona, pp 404–411
Morita H, Sasano R, Takamura H, Okumura M (2013) Subtree extractive summarization via submodular maximization. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1023–1032
Nallapati R, Zhou B, Gulcehre C, Xiang B (2016) Abstractive text summarization using sequence-to-sequence rnns and beyond. In: Proceedings of the 20th SIGNLL conference on computational natural language learning. Association for Computational Linguistics, Berlin, pp 280–290
Nemhauser GL, Wolsey LA, Fisher ML (1978) An analysis of approximations for maximizing submodular set functions—I. Math Program 14(1):265–294
Nenkova A, McKeown K (2012) A survey of text summarization techniques. In: Aggarwal CC, Zhai CX (eds) Mining text data. Springer, Berlin, pp 43–76
Nenkova A, Passonneau R (2004) Evaluating content selection in summarization: the pyramid method. In: Dumais S, Marcu D, Roukos S (eds) HLT-NAACL 2004: main proceedings. Association for Computational Linguistics, Boston, pp 145–152
Nenkova A, McKeown K et al (2011) Automatic summarization. Found Trends Inf Retr 5(2–3):103–233
Ng JP, Abrecht V (2015) Better summarization evaluation with word embeddings for rouge. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1925–1930
Ng JP, Bysani P, Lin Z, Kan MY, Tan CL (2012) Exploiting category-specific information for multi-document summarization. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee, Mumbai, pp 2093–2108
Ng JP, Chen Y, Kan MY, Li Z (2014) Exploiting timelines to enhance multi-document summarization. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Baltimore, pp 923–933
Nichols J, Mahmud J, Drews C (2012) Summarizing sporting events using twitter. In: Proceedings of the 2012 ACM international conference on intelligent user interfaces. ACM, pp 189–198
Nishikawa H, Arita K, Tanaka K, Hirao T, Makino T, Matsuo Y (2014) Learning to generate coherent summary with discriminative hidden semi-markov model. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers. Dublin City University and Association for Computational Linguistics, Dublin, pp 1648–1659
Nishino M, Yasuda N, Hirao T, Minato S, Nagata M (2015) A dynamic programming algorithm for tree trimming-based text summarization. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 462–471
Olariu A (2014) Efficient online summarization of microblogging streams. In: Proceedings of the 14th conference of the European chapter of the Association for Computational Linguistics, volume 2: short papers. Association for Computational Linguistics, Gothenburg, pp 236–240
Owczarzak K, Conroy JM, Dang HT, Nenkova A (2012) An assessment of the accuracy of automatic evaluation in summarization. In: Proceedings of workshop on evaluation metrics and system comparison for automatic summarization. Association for Computational Linguistics, Montréal, pp 1–9
Oya T, Mehdad Y, Carenini G, Ng R (2014) A template-based abstractive meeting summarization: Leveraging summary and source text relationships. In: Proceedings of the 8th international natural language generation conference (INLG). Association for Computational Linguistics, Philadelphia, pp 45–53
Parveen D, Strube M (2015) Integrating importance, non-redundancy and coherence in graph-based extractive summarization. In: International joint conference on artificial intelligence
Parveen D, Ramsl HM, Strube M (2015) Topical coherence for graph-based extractive summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1949–1954
Passonneau RJ, Chen E, Guo W, Perin D (2013) Automated pyramid scoring of summaries using distributional semantics. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Sofia, pp 143–147
Pei Y, Yin W, Fan Q, Huang L (2012) A supervised aggregation framework for multi-document summarization. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee, Mumbai, pp 2225–2242
Peyrard M, Eckle-Kohler J (2016) Optimizing an approximation of rouge - a problem-reduction approach to extractive multi-document summarization. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 1825–1836
Pighin D, Cornolti M, Alfonseca E, Filippova K (2014) Modelling events through memory-based, open-ie patterns for abstractive summarization. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Baltimore, pp 892–901
Qazvinian V, Radev DR, Mohammad S, Dorr BJ, Zajic DM, Whidby M, Moon T (2013) Generating extractive summaries of scientific paradigms. J Artif Intell Res 46:165–201. doi:10.1613/jair.3732
Qian X, Liu Y (2013) Fast joint compression and summarization via graph cuts. In: Proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, pp 1492–1502
Radev DR, Jing H, Sty M, Tam D (2004) Centroid-based summarization of multiple documents. Inf Process Manag 40(6):919–938. doi:10.1016/j.ipm.2003.10.006
Rankel PA, Conroy JM, Dang HT, Nenkova A (2013) A decade of automatic content evaluation of news summaries: reassessing the state of the art. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 2: short papers). Association for Computational Linguistics, Sofia, pp 131–136
Ranzato M, Chopra S, Auli M, Zaremba W (2016) Sequence level training with recurrent neural networks. In: International conference on learning representations (ICLR)
Ren P, Wei F, Chen Z, Ma J, Zhou M (2016) A redundancy-aware sentence regression framework for extractive summarization. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 33–43
Ren Z, de Rijke M (2015) Summarizing contrastive themes via hierarchical non-parametric processes. In: Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval, Santiago, Chile, August 9–13, 2015, pp 93–102. doi:10.1145/2766462.2767713
Rioux C, Hasan SA, Chali Y (2014) Fear the reaper: A system for automatic multi-document summarization with reinforcement learning. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 681–690
Rodriguez A, Laio A (2014) Clustering by fast search and find of density peaks. Science 344(6191):1492–1496
Ross S, Zhou J, Yue Y, Dey D, Bagnell D (2013) Learning policies for contextual submodular prediction. In: Proceedings of the 30th international conference on machine learning, ICML 2013, Atlanta, GA, USA, 16–21 June 2013, pp 1364–1372
Rush AM, Chopra S, Weston J (2015) A neural attention model for abstractive sentence summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 379–389
Saggion H (2013) Unsupervised learning summarization templates from concise summaries. In: Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Atlanta, pp 270–279
Schluter N, Søgaard A (2015) Unsupervised extractive summarization via coverage maximization with syntactic and semantic concepts. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 2: short papers). Association for Computational Linguistics, Beijing, pp 840–844
Sharifi B, Hutton MA, Kalita J (2010) Summarizing microblogs automatically. In: Human language technologies: the 2010 annual conference of the North American chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Los Angeles, pp 685–688
Shen C, Li T (2011) Learning to rank for query-focused multi-document summarization. In: 2011 IEEE 11th international conference on data mining (ICDM). IEEE, pp 626–634
Shen D, Sun JT, Li H, Yang Q, Chen Z (2007) Document summarization using conditional random fields. In: International joint conference on artificial intelligence, vol 7, pp 2862–2867
Sidhaye P, Cheung JCK (2015) Indicative tweet generation: an extractive summarization problem? In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 138–147
Sipos R, Shivaswamy P, Joachims T (2012) Large-margin learning of submodular summarization models. In: Proceedings of the 13th conference of the European chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Avignon, pp 224–233
Snoek J, Larochelle H, Adams RP (2012) Practical bayesian optimization of machine learning algorithms. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems. Curran Associates, Inc., Lake Tahoe, Nevada, pp 2951–2959
Sukhbaatar S, Szlam A, Weston J, Fergus R (2015) End-to-end memory networks. Adv Neural Inf Process Syst 28:2440–2448
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Adv Neural Inf Process Syst 27:3104–3112
Swisher K (2013) Yahoo paid $30 million in cash for 18 months of young summly entrepreneur’s time. http://allthingsd.com/20130325/yahoo-paid-30-million-in-cash-for-18-months-of-young-summly-entrepreneurs-time/. Accessed 30 Dec 2016
Takamura H, Yokono H, Okumura M (2011) Summarizing a document stream. In: Advances in information retrieval—33rd European conference on IR research, ECIR 2011, Dublin, Ireland, April 18–21, 2011. Proceedings, pp 177–188
Thadani K, McKeown K (2013) Supervised sentence fusion with single-stage inference. In: Proceedings of the sixth international joint conference on natural language processing. Asian Federation of Natural Language Processing, Nagoya, pp 1410–1418
Toutanova K, Brockett C, Tran KM, Amershi S (2016) A dataset and evaluation metrics for abstractive compression of sentences and short paragraphs. In: Proceedings of the 2016 conference on empirical methods in natural language processing. Association for Computational Linguistics, Austin, pp 340–350
Tran G, Herder E, Markert K (2015) Joint graphical models for date selection in timeline summarization. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers). Association for Computational Linguistics, Beijing, pp 1598–1607
Trione J, Favre B, Béchet F (2016) Beyond utterance extraction: summary recombination for speech summarization. Interspeech 2016:680–684
Vanderwende L, Suzuki H, Brockett C, Nenkova A (2007) Beyond sumbasic: task-focused summarization with sentence simplification and lexical expansion. Inf Process Manag 43(6):1606–1618. doi:10.1016/j.ipm.2007.01.023
Wan X (2011) Using bilingual information for cross-language document summarization. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Portland, pp 1546–1555
Wan X (2012) Update summarization based on co-ranking with constraints. In: Proceedings of COLING 2012: posters. The COLING 2012 Organizing Committee, Mumbai, pp 1291–1300
Wan X, Zhang J (2014) CTSUM: extracting more certain summaries for news articles. In: The 37th international ACM SIGIR conference on research and development in information retrieval, SIGIR ’14, Gold Coast, QLD, Australia, July 06–11, 2014, pp 787–796. doi:10.1145/2600428.2609559
Wang D, Li T (2012) Weighted consensus multi-document summarization. Inf Process Manag 48(3):513–523
Wang D, Zhu S, Li T, Gong Y (2013) Comparative document summarization via discriminative sentence selection. TKDD 7(1):2:1–2:18. doi:10.1145/2435209.2435211
Wang L, Cardie C (2013) Domain-independent abstract generation for focused meeting summarization. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1395–1405
Wang L, Ling W (2016) Neural network-based abstract generation for opinions and arguments. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, San Diego, pp 47–57
Wang L, Raghavan H, Castelli V, Florian R, Cardie C (2013) A sentence compression based framework to query-focused multi-document summarization. In: Proceedings of the 51st annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, pp 1384–1394
Wang L, Raghavan H, Cardie C, Castelli V (2014) Query-focused opinion summarization for user-generated content. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers. Dublin City University and Association for Computational Linguistics, Dublin, pp 1660–1669
Wang WY, Mehdad Y, Radev DR, Stent A (2016) A low-rank approximation approach to learning joint embeddings of news stories and images for timeline summarization. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, San Diego, pp 58–68
Wang X, Yoshida Y, Hirao T, Sudoh K, Nagata M (2015) Summarization based on task-oriented discourse parsing. IEEE/ACM Trans Audio Speech Lang Process 23(8):1358–1367. doi:10.1109/TASLP.2015.2432573
Wang X, Nishino M, Hirao T, Sudoh K, Nagata M (2016) Exploring text links for coherent multi-document summarization. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 213–223
Woodsend K, Lapata M (2012) Multiple aspect summarization using integer linear programming. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning. Association for Computational Linguistics, Jeju Island, pp 233–243
Xiong W, Litman D (2014) Empirical analysis of exploiting review helpfulness for extractive summarization of online reviews. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: technical papers. Dublin City University and Association for Computational Linguistics, Dublin, pp 1985–1995
Xu H, Martin E, Mahidadia A (2015) Extractive summarisation based on keyword profile and language model. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 123–132
Yan R, Kong L, Huang C, Wan X, Li X, Zhang Y (2011) Timeline generation through evolutionary trans-temporal summarization. In: Proceedings of the 2011 conference on empirical methods in natural language processing. Association for Computational Linguistics, Edinburgh, pp 433–443
Yan R, Wan X, Otterbacher J, Kong L, Li X, Zhang Y (2011) Evolutionary timeline summarization: a balanced optimization framework via iterative substitution. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval, SIGIR 2011, Beijing, China, July 25–29, 2011, pp 745–754. doi:10.1145/2009916.2010016
Yan R, Jiang H, Lapata M, Lin SD, Lv X, Li X (2013) I, poet: automatic chinese poetry composition through a generative summarization framework under constrained optimization. In: Proceedings of the twenty-third international joint conference on artificial intelligence. AAAI Press, pp 2197–2203
Yan S, Wan X (2014) Srrank: leveraging semantic roles for extractive multi-document summarization. IEEE/ACM Trans Audio Speech Lang Process 22(12):2048–2058
Yang L, Ai Q, Spina D, Chen RC, Pang L, Croft WB, Guo J, Scholer F (2016) Beyond factoid QA: effective methods for non-factoid answer sentence retrieval. In: European conference on information retrieval. Springer, Berlin, pp 115–128
Yang Z, Cai K, Tang J, Zhang L, Su Z, Li J (2011) Social context summarization. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval. ACM, pp 255–264
Yao J, Wan X, Xiao J (2015) Compressive document summarization via sparse optimization. In: International joint conference on artificial intelligence
Yao J, Wan X, Xiao J (2015) Phrase-based compressive cross-language summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 118–127
Yin W, Pei Y (2015) Optimizing sentence modeling and selection for document summarization. In: International joint conference on artificial intelligence
Yogatama D, Liu F, Smith NA (2015) Extractive summarization by maximizing semantic volume. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, pp 1961–1966
Yoshida Y, Suzuki J, Hirao T, Nagata M (2014) Dependency-based discourse parser for single-document summarization. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, pp 1834–1839
You O, Li W, Li S, Lu Q (2011) Applying regression models to query-focused multi-document summarization. Inf Process Manag 47(2):227–237. doi:10.1016/j.ipm.2010.03.005
Yu N, Huang M, Shi Y, Zhu X (2016) Product review summarization by exploiting phrase properties. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 1113–1124
Zajic DM, Dorr B, Lin J, Schwartz R (2006) Sentence compression as a component of a multi-document summarization system. In: Proceedings of the 2006 document understanding workshop, New York
Zhang J, Yao J, Wan X (2016a) Towards constructing sports news from live text commentary. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, pp 1361–1371
Zhang J, Zhou Y, Zong C (2016b) Abstractive cross-language summarization via translation model enhanced predicate argument structure fusing. IEEE/ACM Trans Audio Speech Lang Process 24(10):1842–1853
Zhang R, Li W, Gao D (2013) Towards content-level coherence with aspect-guided summarization. TSLP 10(1):2:1–2:22. doi:10.1145/2442076.2442078
Zhang Y, Xia Y, Liu Y, Wang W (2015) Clustering sentences with density peaks for multi-document summarization. In: Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, Denver, pp 1262–1267
Zhao WX, Guo Y, Yan R, He Y, Li X (2013) Timeline generation with social attention. In: The 36th international ACM SIGIR conference on research and development in information retrieval, SIGIR ’13, Dublin, Ireland, July 28–August 01, 2013, pp 1061–1064. doi:10.1145/2484028.2484103
Zopf M, Loza Mencía E, Fürnkranz J (2016a) Sequential clustering and contextual importance measures for incremental update summarization. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. The COLING 2016 Organizing Committee, Osaka, pp 1071–1082
Zopf M, Loza Mencía E, Fürnkranz J (2016b) Beyond centrality and structural features: learning information importance for text summarization. In: Proceedings of the 20th SIGNLL conference on computational natural language learning. Association for Computational Linguistics, Berlin, pp 84–94

Acknowledgements

The work was supported by the National Hi-Tech Research and Development Program (863 Program) of China (2015AA015403), the National Natural Science Foundation of China (61331011) and the IBM Global Faculty Award Program. We would like to thank the anonymous reviewers for their feedback and Jiwei Tan for reporting typos in an earlier draft of this paper.

Author information

Authors and Affiliations

Institute of Computer Science and Technology, Peking University, Beijing, 100871, China
Jin-ge Yao, Xiaojun Wan & Jianguo Xiao
The MOE Key Laboratory of Computational Linguistics, Peking University, Beijing, China
Jin-ge Yao, Xiaojun Wan & Jianguo Xiao

Authors

Jin-ge Yao
Xiaojun Wan
Jianguo Xiao

Corresponding author

Correspondence to Xiaojun Wan.

About this article

Cite this article

Yao, Jg., Wan, X. & Xiao, J. Recent advances in document summarization. Knowl Inf Syst 53, 297–336 (2017). https://doi.org/10.1007/s10115-017-1042-4

Received: 27 October 2016
Accepted: 17 March 2017
Published: 28 March 2017
Issue Date: November 2017
DOI: https://doi.org/10.1007/s10115-017-1042-4
