Text summarisation in progress: a literature review

Artificial Intelligence Review

Abstract

This paper presents an extensive literature review of the research field of Text Summarisation (TS) based on Human Language Technologies (HLT). TS helps users manage the vast amount of information available by condensing documents’ content and extracting the most relevant facts or topics they contain. The rapid development of emerging technologies poses new challenges to this research field that remain to be solved. It is therefore essential to analyse its progress over the years and to provide an overview of past, present and future directions, highlighting the main advances achieved and outlining the remaining limitations. To this end, the survey addresses several important aspects. On the one hand, the paper aims to give a general perspective on the state of the art, describing the main concepts, the different summarisation approaches, and the relevant international forums. It also stresses that new requirements and scenarios have led to new types of summaries with specific purposes (e.g. sentiment-based summaries) and to novel domains for which TS has also proven suitable (e.g. blogs). In addition, TS has been successfully combined with a number of intelligent systems based on HLT (e.g. information retrieval, question answering, and text classification). On the other hand, the paper also studies the evaluation of summaries in depth, explaining the existing methodologies and systems as well as new research on the automatic evaluation of summary quality. Finally, some reflections on TS in general and on its future are intended to encourage the reader to think of novel approaches, applications and lines of research for the coming years. The analysis of these issues gives the reader a broad and useful background on the most important aspects of this research field.

References

  • Agnihotri L, Kender JR, Dimitrova N, Zimmerman J (2005) User study for generating personalized summary profiles. In: Proceedings of the IEEE international conference on multimedia and expo (ICME). pp 1094–1097

  • Ahmet A, Gaizauskas R (2010) Generating image descriptions using dependency relational patterns. In: Proceedings of the 48th annual meeting of the association for computational linguistics

  • Aker A, Gaizauskas R (2009) Summary generation for toponym-referenced images using object type language models. In: Proceedings of the international conference on recent advances in natural language processing (RANLP-2009)

  • Aker A, Gaizauskas R (2010) Model summaries for OPTlocation-related images. In: Proceedings of language resources and evaluation

  • Amigó E, Gonzalo J, Peñas A, Verdejo F (2005) QARLA: a framework for the evaluation of text summarization systems. In: ACL ’05: proceedings of the 43rd annual meeting on association for computational linguistics. pp 280–289

  • Ando R, Boguraev B, Byrd R, Neff M (2005) Visualization-enabled multi-document summarization by Iterative Residual Rescaling. Nat Lang Eng 11(1): 67–86

    Article  Google Scholar 

  • Angheluta R, Busser RD, Francine Moens M (2002) The use of topic segmentation for automatic summarization. In: Proceedings of the ACL-2002 post-conference workshop on automatic summarization. pp 66–70

  • Aone C, Okurowski ME, Gorlinsky J (1998) Trainable, scalable summarization using robust NLP and machine learning. In: Proceedings of the 36th annual meeting of the association for computational linguistics and 17th international conference on computational linguistics, vol 1. pp 62–66

  • Azzam S, Humphreys K, Gaizauskas R (1999) Using coreference chains for text summarization. In: Proceedings of the ACL’99 workshop on coreference and its applications

  • Balahur A, Montoyo A (2008) Multilingual feature-driven opinion extraction and summarization from customer reviews. In: Proceedings of 13th international conference on applications of natural language to information systems. pp 345–346

  • Balahur A, Lloret E, Ferrández O, Montoyo A, Palomar M, Muñoz R (2008) The DLSIUAES team’s participation in the TAC 2008 tracks. In: Proceedings of the text analysis conference (TAC)

  • Balahur A, Lloret E, Boldrini E, Montoyo A, Palomar M, Martinez-Barco P (2009) Summarizing threads in blogs using opinion polarity. In: Proceedings of the international workshop on events in emerging text types (eETTs). pp 5–13

  • Balahur-Dobrescu A, Kabadjov M, Steinberger J, Steinberger R, Montoyo A (2009) Summarizing opinions in blog threads. In: Proceedings of the Pacific Asia conference on language, information and computation conference. pp 606–613

  • Baldwin B, Morton TS (1998) Dynamic coreference-based summarization. In: Proceedings of the third conference on empirical methods in natural language processing (EMNLP-3)

  • Barzilay R, Elhadad M (1999) Using lexical chains for text summarization. In: Advances in automatic text summarization. pp 111–122

  • Barzilay R, Lapata M (2005) Modeling local coherence: an entity-based approach. In: Proceedings of the 43rd annual meeting of the association for computational linguistics (ACL’05). pp 141–148

  • Barzilay R, McKeown KR (2005) Sentence fusion for multidocument news summarization. Comput Linguist 31(3): 297–328

    Article  Google Scholar 

  • Beineke P, Hastie T, Manning C, Vaithyanathan S (2004) An exploration of sentiment summarization. In: Proceedings of the AAAI spring symposium on exploring attitude and affect in text: theories and applications

  • Bellemare S, Bergler S, Witte R (2008) ERSS at TAC 2008. In: Proceedings of the text analysis conference (TAC)

  • Belz A (2008) Automatic Generation of Weather Forecast Texts Using Comprehensive Probabilistic Generation-space Models. Nat Lang Eng 14(4): 431–455

    Article  Google Scholar 

  • Berkovsky S, Baldwin T, Zukerman I (2008) Aspect-based personalized text summarization. In: Proceedings of the 5th international conference on adaptive hypermedia and adaptive web-based systems. pp 267–270

  • Biadsy F, Hirschberg J, Filatova E (2008) An unsupervised approach to biography production using Wikipedia. In: Proceedings of ACL-08: HLT. pp 807–815

  • Boguraev BK, Neff MS (2000) Discourse segmentation in aid of document summarization. In: Proceedings of the 33rd Hawaii international conference on system sciences, vol 3. p 3004

  • Bossard A, Généreux M, Poibeau T (2008) Description of the LIPN systems at TAC 2008: summarizing information and opinions. In: Proceedings of the text analysis conference (TAC)

  • Branny E (2007) Automatic summary evaluation based on text grammars. J Digit Inf 8(3). http://journals.tdl.org/jodi/article/viewArticle/232

  • Burges C, Shaked T, Renshaw E, Lazier A, Deeds M, Hamilton N, Hullender G (2005) Learning to rank using gradient descent. In: Proceedings of the 22nd international conference on Machine learning. pp 89–96

  • Carenini G, Cheung JCK (2008) Extractive vs. NLG-based abstractive summarization of evaluative text: the effect of corpus controversiality. In: Proceedings of the fifth international natural language generation conference, ACL 2008. pp 33–40

  • Cesarano C, Mazzeo A, Picariello A (2007) A system for summary-document similarity in notary domain. International Workshop on Database Expert Syst Appl:254–258

  • Ceylan H, Mihalcea R (2009) The decomposition of human-written book summaries. In: Proceedings of the 10th international conference on computational linguistics and intelligent text processing (CICLing ’09). pp 582–593

  • Cole, R (ed) (1997) Survey of the state of the art in human language technology. Cambridge University Press, Cambridge

    Google Scholar 

  • Conroy J, Schlesinger J (2008) CLASSY at TAC 2008 Metrics. In: Proceedings of the text analysis conference (TAC)

  • Conroy JM, Dang HT (2008) Mind the gap: dangers of divorcing evaluations of summary content from linguistic quality. In: Proceedings of the 22nd international conference on computational linguistics (Coling 2008). pp 145–152

  • Conroy JM, O’leary DP (2001) Text summarization via hidden Markov models. In: SIGIR ’01: proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval. pp 406–407

  • Conroy JM, Schlesinger JD, O’Leary DP (2009) CLASSY 2009: summarization and metrics. In: Proceedings of the text analysis conference (TAC)

  • Cristea D, Postolache O, Pistol I (2005) Summarisation through discourse structure. In: Proceedings of the computational linguistics and intelligent text processing, 6th International conference (CICLing 2005). pp 632–644

  • Cunha ID, Fernández S, Velázquez-Morales P, Vivaldi J, SanJuan E, Moreno JMT (2007) A new hybrid summarizer based on vector space model, statistical physics and linguistics. In: MICAI 2007: advances in artificial intelligence. pp 872–882

  • Dang HT (2006) Overview of DUC 2006. In: The document understanding workshop (presented at the HLT/NAACL). Brooklyn, New York, USA

  • Demner-Fushman D, Lin J (2006) Answer extraction, semantic clustering, and extractive summarization for clinical question answering. In: Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the association for computational linguistics. pp 841–848

  • Deschacht K, Moens MF (2007) Text analysis for automatic image annotation. In: Proceedings of the 45th annual meeting of the association of computational linguistics. pp 1000–1007

  • Díaz A, Gervás P (2007) User-model based Personalized Summarization. Inf Process Manag 43(6): 1715–1734

    Article  Google Scholar 

  • Donaway RL, Drummey KW, Mather LA (2000) A comparison of rankings produced by summarization evaluation measures. In: Proceedings of NAACL-ANLP 2000 workshop on automatic summarization. pp 69–78

  • Dunlavy DM, O’Leary DP, Conroy JM, Schlesinger JD (2007) QCS: A system for querying, clustering and summarizing documents. Inf Process Manag 43(6): 1588–1605

    Article  Google Scholar 

  • Edmundson HP (1969) New methods in automatic extracting. In: Mani I, Maybury M (eds) Advances in automatic text summarization. pp 23–42

  • Elsner M, Charniak E (2008) Coreference-inspired coherence modeling. In: Proceedings of ACL-08: HLT, short papers. pp 41–44

  • Ercan G, Cicekli I (2008) Lexical cohesion based topic modeling for summarization. In: Proceedings of the 9th international conference in computational linguistics and intelligent text processing. pp 582–592

  • Erkan G, Radev DR (2004) LexRank: Graph-based Lexical Centrality as Salience in Text Summarization. J Artif Intell Res (JAIR) 22: 457–479

    Google Scholar 

  • Fan J, Gao Y, Luo H, Keim DA, Li Z (2008) A novel approach to enable semantic and visual image summarization for exploratory image search. In: MIR ’08: proceeding of the 1st ACM international conference on multimedia information retrieval. pp 358–365

  • Fellbaum C (1998) WordNet: an electronical lexical database. The MIT Press, Cambridge

    Google Scholar 

  • Feng Y, Lapata M (2008) Automatic image annotation using auxiliary text information. In: Proceedings of ACL-08: HLT. pp 272–280

  • Filatova E, Hatzivassiloglou V (2004) Event-based extractive summarization. In: Marie-Francine Moens SS (ed) Text summarization branches out: proceedings of the ACL-04 workshop. pp 104–111

  • Fisher S, Dunlop A, Roark B, Chen Y, Burmeister J (2009) OHSU summarization and entity linking systems. In: Proceedings of the text analysis conference (TAC)

  • Fiszman M, Rindflesch TC, Kilicoglu H (2004) Abstraction summarization for managing the biomedical research literature. In: Moldovan D, Girju R (eds) HLT-NAACL 2004: workshop on computational lexical semantics. pp 76–83

  • Fuentes M, González E, Ferrés D, Rodríguez H (2005) QASUM-TALP at DUC 2005 automatically evaluated with a pyramid based metric. In: The document understanding workshop (presented at the HLT/EMNLP annual meeting)

  • Fuentes M, Alfonseca E, Rodríguez H (2007) Support vector machines for query-focused summarization trained and evaluated on pyramid data. In: Proceedings of the 45th annual meeting of the association for computational linguistics companion volume proceedings of the demo and poster sessions. pp 57–60

  • Fukushima T, Okumura M (2001) Text summarization challenge: text summarization evaluation at NTCIR workshop 2. In: Proceedings of the second NTCIR workshop meeting on evaluation of chinese and japanese text retrieval and text summarization. pp 9–13

  • Giannakopoulos G, Karkaletsis V (2009) N-GRAM GRAPHS: representing documents and document sets in summary system evaluation. In: Proceedings of the text analysis conference (TAC)

  • Giannakopoulos G, Karkaletsis V, Vouros G (2008a) Testing the use of n-gram graphs in summarization sub-tasks. In: Proceedings of the text analysis conference (TAC)

  • Giannakopoulos G, Karkaletsis V, Vouros G, Stamatopoulos P (2008) Summarization System Evaluation Revisited: N-gram graphs. ACM Trans Speech Lang Process 5(3): 1–39

    Article  Google Scholar 

  • Goldstein J, Mittal V, Carbonell J, Kantrowitz M (2000) Multi-document summarization by sentence extraction. In: NAACL-ANLP 2000 workshop on automatic Summarization. pp. 40–48

  • Gonçalves PN, Rino L, Vieira R (2008) Summarizing and referring: towards cohesive extracts. In: DocEng ’08: proceeding of the eighth ACM symposium on document engineering. pp 253–256

  • Gotti F, Lapalme G, Nerima L, Wehrli E (2007) GOFAISUM: a symbolic summarizer for DUC. In: The document understanding workshop (presented at the HLT/NAACL)

  • Grosz BJ, Weinstein S, Joshi AK (1995) Centering: A Framework for Modeling the Local Coherence of Discourse. Comput Linguist 21(2): 203–225

    Google Scholar 

  • Harabagiu S, Lacatusu F (2005) Topic themes for multi-document summarization. In: SIGIR ’05: proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval. pp 202–209

  • Hasler L (2007) From extracts to abstracts: human summary production operations for computer-aided summarisation. In: Proceedings of the RANLP 2007 workshop on computer-aided language processing (CALP). pp 11–18

  • Hasler L (2008) Centering theory for evaluation of coherence in computer-aided summaries. In: (ELRA) ELRA (ed) Proceedings of the sixth international conference on language resources and evaluation (LREC’08)

  • Hassel M (2007) Resource lean and portable automatic text summarization. PhD thesis, Department of Numerical Analysis and Computer Science, Royal Institute of Technology

  • He L, Sanocki E, Gupta A, Grudin J (1999) Auto-summarization of audio-video presentations. In: MULTIMEDIA ’99: proceedings of the seventh ACM international conference on multimedia (Part 1). pp 489–498

  • He T, Chen J, Gui Z, Li F (2008) CCNU at TAC 2008: proceeding on using semantic method for automated summarization. In: Proceedings of the text analysis conference (TAC)

  • Hearst MA (1997) TextTiling: segmenting text into multi-paragraph subtopic passages. Comput Linguist 23(1): 33–64

    Google Scholar 

  • Hirao T, Okumura M, Fukusima T, Nanba H (2005) Text summarization challenge 3—text summarization evaluation at NTCIR workshop 4. In: Proceedings of the fourth NTCIR workshop on research in information access technologies information retrieval, question answering and summarization. pp 407–411

  • Hovy E, Lin CY (1999) Automated multilingual text summarization and its evaluation. Technical report Information Sciences Institute, University of Southern California

  • Hovy E, Lin CY, Zhou L, Fukumoto J (2006) Automated summarization evaluation with basic elements. In: Proceedings of the 5th international conference on language resources and evaluation (LREC)

  • Jaoua M, Hamadou AB (2003) Automatic text summarization of scientific articles based on classification of extract’s Population. In: Proceedings of computational linguistics and intelligent text processing, 4th international conference. pp 623–634

  • Jing H (2002) Using hidden Markov modeling to decompose human-written summaries. Comput Linguist 28(4): 527–543

    Article  Google Scholar 

  • Jing H, McKeown KR (2000) Cut and paste based text summarization. In: Proceedings of the 1st North American chapter of the association for computational linguistics Conference. pp 178–185

  • Kaisser M, Hearst MA, Lowe JB (2008) Improving search results quality by customizing summary lengths. In: Proceedings of ACL-08: HLT. pp 701–709

  • Kan MY, Klavans JL (2002) Using librarian techniques in automatic text summarization for information retrieval. In: JCDL ’02: proceedings of the 2nd ACM/IEEE-CS joint conference on digital libraries. pp 36–45

  • Kan MY, Klavans JL, Mckeown KR (2002) Using the annotated bibliography as a resource for indicative summarization. In: Proceedings of the language resources and evaluation conference. pp 1746–1752

  • Katragadda R (2010) GEMS: generative modeling for evaluation of summaries. In: Proceedings of the 11th international conference on computational linguistics and intelligent text processing, CICLing. pp 724–735

  • Kazantseva A (2006) An approach to summarizing short stories. In: Proceedings of the student research workshop at the 11th conference of the European chapter of the association for computational linguistics. pp 55–62

  • Ker SJ, Chen JN (2000) A text categorization based on summarization technique. In: Proceedings of the ACL-2000 workshop on recent advances in natural language processing and information retrieval. pp 79–83

  • Khan AU, Khan S, Mahmood W (2005) MRST: a new technique for information summarization. In: The second world enformatika conference, WEC’05. pp 249–252

  • Kumar C, Pingali P, Varma V (2008) Generating personalized summaries using publicly available web documents. In: Proceedings of the 2008 IEEE/WIC/ACM international conference on web intelligence and international conference on intelligent agent technology. pp 103–106

  • Kumar M, Das D, Agarwal S, Rudnicky A (2009) Non-textual event summarization by applying machine learning to template-based language generation. In: Proceedings of the 2009 workshop on language generation and summarisation (UCNLG + Sum 2009). pp 67–71

  • Kuo JJ, Chen HH (2008) Multidocument Summary Generation: Using Informative and Event Words. ACM Trans Asian Lang Inf Process (TALIP) 7(1): 1–23

    Article  Google Scholar 

  • Kupiec J, Pedersen J, Chen F (1995) A trainable document summarizer. In: SIGIR ’95: proceedings of the 18th annual international ACM SIGIR conference on research and development in information retrieval. pp 68–73

  • Lapata M, Barzilay R (2005) Automatic evaluation of text coherence: models and representations. In: Proceedings of the 19th international joint conference on artificial intelligence. pp 1085–1090

  • Lerman K, McDonald R (2009) Contrastive summarization: an experiment with consumer reviews. In: Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the association for computational linguistics, companion volume: short papers. pp 113–116

  • Lerman K, Blair-Goldensohn S, McDonald R (2009) Sentiment summarization: evaluating and learning user preferences. In: Proceedings of the 12th conference of the European chapter of the ACL (EACL 2009). pp 514–522

  • Li S, Ouyang Y, Wang W, Sun B (2007) Multi-document summarization using support vector regression. In: The document understanding workshop (presented at the HLT/NAACL). Rochester, New York USA

  • Li S, Wan W, Wang C (2008) TAC 2008 update summarization task of ICL. In: Proceedings of the text analysis conference (TAC)

  • Li S, Wang W, Zhang Y (2009) Tac 2009 update summarization of icl. In: Proceedings of the text analysis conference (TAC)

  • Lin CY (2004) ROUGE: a package for automatic evaluation of summaries. In: Proceedings of ACL text summarization workshop. pp 74–81

  • Lin CY, Hovy E (2000) The automated acquisition of topic signatures for text summarization. In: Proceedings of the 18th conference on computational linguistics. pp 495–501

  • Liu F, Liu Y (2008) Correlation between ROUGE and human evaluation of extractive meeting summaries. In: Proceedings of ACL-08: HLT, short papers. pp 201–204

  • Liu M, Yu B, Fang F, Sun H (2009) TAC 2009 update summarization task of WUST. In: Proceedings of the text analysis conference (TAC)

  • Lloret E, Palomar M (2009) A gradual combination of features for building automatic summarisation systems. In: Proceedings of the 12th international conference on text, speech and dialogue (TSD). pp 16–23

  • Lloret E, Ferrández O, Muñoz R, Palomar M (2008) A text summarization approach under the influence of textual entailment. In: Proceedings of the 5th international workshop on natural language processing and cognitive science (NLPCS 2008). pp 22–31

  • Lloret E, Balahur A, Palomar M, Montoyo A (2009) Towards building a competitive opinion summarization system: challenges and keys. In: Proceedings of the NAACL. Student Research Workshop and Doctoral Consortium. pp 72–77

  • Lloret E, Saggion H, Palomar M (2010) Experiments on summary-based opinion classification. In: Proceedings of the NAACL HLT 2010 workshop on computational approaches to analysis and generation of emotion in text. pp 107–115

  • Luhn HP (1958) The automatic creation of literature abstracts. In: Advances in automatic text summarization. pp 15–22

  • Mani I (2001) Automatic summarization. John Benjamins Publishing Co. Amsterdam, Philadelphia, USA

  • Mani I (2001b) Summarization evaluation: an overview. In: Proceedings of the North American chapter of the association for computational linguistics (NAACL). Workshop on Automatic Summarization

  • Mani I, Maybury MT (1999) Advances in automatic text summarization. The MIT Press, Cambridge

    Google Scholar 

  • Mani I, House D, Klein G, Hirschman L, Firmin T, Sundheim B (1999) The TIPSTER SUMMAC text summarization evaluation. In: Proceedings of the ninth conference on European chapter of the association for computational linguistics. pp 77–85

  • Mani I, Klein G, House D, Hirschman L, Firmin T, Sundheim B (2002) SUMMAC: a text summarization evaluation. Nat Lang Eng 8(1): 43–68

    Article  Google Scholar 

  • Mann WC, Thompson SA (1988) Rhetorical structure theory: Toward a functional theory of text organization. Text 8(3): 243–281

    Article  Google Scholar 

  • Manning CD, Raghavan P, Schtze H (2008) Introduction to information retrieval. Cambridge University Press, New York, NY, USA

    MATH  Google Scholar 

  • Marcu D (1999) Discourse trees are good indicators of importance in text. In: Advances in automatic text summarization. pp 123–136

  • McCargar V (2005) Statistical Approaches to Automatic Text Summarization. Bull Am Soc Inf Sci Technol 30(4): 21–25

    Article  Google Scholar 

  • Medelyan O (2007) Computing lexical chains with graph clustering. In: Proceedings of the ACL 2007 student research workshop. pp 85–90

  • Mihalcea R (2004) Graph-based ranking algorithms for sentence extraction, applied to text summarization. In: Proceedings of the ACL 2004 on interactive poster and demonstration sessions. p 20

  • Mihalcea R, Ceylan H (2007) Explorations in automatic book summarization. In: Proceedings of the joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL). pp 380–389

  • Mille S, Wanner L (2008) Multilingual summarization in practice: the case of patent claims. In: Proceedings of the 12th European association of machine translation conference. pp 120–129

  • Minel JL, Nugier S, Piat G (1997) How to appreciate the quality of automatic text summarization? Examples of FAN and MLUCE protocols and their results on SERAPHIN. In: Proceedings of intelligent scalable text summarization workshop in conjunction with the European chapter of the association of computational linguistics (EACL). pp 25–30

  • Mitkov R, Evans R, Orasan C, Ha LA, Pekar V (2007) Anaphora resolution: to what extent does it help NLP applications? In: Proceedings of the 6th discourse anaphora and anaphor resolution colloquium. pp 179–190

  • Mohammad S, Dorr B, Egan M, Hassan A, Muthukrishan P, Qazvinian V, Radev D, Zajic D (2009) Using citations to generate surveys of scientific paradigms. In: Proceedings of human language technologies: The 2009 annual conference of the North American chapter of the association for computational linguistics. pp 584–592

  • Mori T (2002) Information gain ratio as term weight: the case of summarization of IR results. In: Proceedings of the 19th international conference on computational linguistics. pp 1–7

  • Mori T, Nozawa M, Asada Y (2004) Multi-answer-focused multi-document summarization using a question-answering engine. In: COLING ’04: proceedings of the 20th international conference on computational linguistics. pp 439–445

  • Mori T, Nozawa M, Asada Y (2005) Multi-answer-focused multi-document summarization using a question-answering engine. ACM Trans Asian Lang Inf Process (TALIP) 4(3): 305–320

    Article  Google Scholar 

  • Morris AH, Kasper GM, Adams DA (1992) The Effect and Limitations of Automatic Text Condensing on Reading Comprehension Performance. Inf Syst Res 3(1): 17–35

    Article  Google Scholar 

  • Nastase V, Milne D, Filippova K (2009) Summarizing with encyclopedic knowledge. In: Proceedings of the text analysis conference (TAC)

  • Nenkova A (2005) Automatic text summarization of newswire: lessons learned from the document understanding conference. In: Proceedings of the American association fro artificial intelligence (AAAI). pp 1436–1441

  • Nenkova A (2006) Summarization evaluation for text and speech: issues and Approaches. In: INTERSPEECH-2006, paper 2079-Wed1WeS.1

  • Nenkova A, Siddharthan A, McKeown K (2005) Automatically learning cognitive status for multi-document summarization of newswire. In: HLT ’05: proceedings of the conference on human language technology and empirical methods in natural language processing. pp 241–248

  • Neto JL, Santos A, Kaestner CAA, Freitas AA (2000) Generating text summaries through the relative importance of topics. In: IBERAMIA-SBIA ’00: proceedings of the international joint conference, 7th Ibero-American conference on AI. pp 300–309

  • Okumura M, Fukusima T, Nanba H, Hirao T (2004) Text Summarization Challenge 2 text summarization evaluation at NTCIR workshop 3. SIGIR Forum 38(1): 29–38

    Article  Google Scholar 

  • Orăsan C (2004) The influence of personal pronouns for automatic summarisation of scientific articles. In: Proceedings of the discourse anaphora and anaphor resolution colloquium. pp 127–132

  • Orăsan C (2007) Pronominal anaphora resolution for text summarisation. In: Proceedings of the recent advances on natural language processing. pp 430–436

  • Orăsan C (2009) Comparative Evaluation of Term-Weighting Methods for Automatic Summarization. J Quant Linguist 16(1): 67–95

    Article  Google Scholar 

  • Orăsan C, Pekar V, Hasler L (2004) A comparison of summarisation methods based on term specificity estimation. In: Proceedings of the fourth international conference on language resources and evaluation (LREC2004). pp 1037–1041. Available at:http://clg.wlv.ac.uk/papers/orasan-04a.pdf

  • Over P, Ligget W (2002) Introduction to DUC: an intrinsic evaluation of generic news text summarization systems. In: The document understanding workshop

  • Over P, Dang H, Harman D (2007) DUC in Context. Inf Process Manag 43(6): 1506–1520

    Article  Google Scholar 

  • Owczarzak K (2009) DEPEVAL(summ): dependency-based evaluation for automatic summaries. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP. pp 190–198

  • Pang B, Lee L (2005) Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: Proceedings of the association of computational linguistics. pp 115–124

  • Pang B, Lee L (2008) Opinion Mining and Sentiment Analysis. Found Trends Inf Retr 2(1–2): 1–135

    Article  Google Scholar 

  • Papineni K, Roukos S, Ward T, Zhu WJ (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of 40th annual meeting of the association for computational linguistics. pp 311–318

  • Passoneau RJ (2010) Formal and Functional Assessment of the Pyramid Method for Summary Content Evaluation. Nat Lang Eng 16(2): 107–131

    Article  Google Scholar 

  • Pitler E, Nenkova A (2008) Revisiting readability: a unified framework for predicting text quality. In: Proceedings of the 2008 conference on empirical methods in natural language processing. pp 186–195

  • Plaza L, Díaz A, Gervás P (2008) Concept-graph based biomedical automatic Summarization Using Ontologies. In: Coling 2008: Proceedings of the 3rd textgraphs workshop on graph-based algorithms for natural language processing. pp 53–56

  • Plaza L, Lloret E, Aker A (2010) Improving automatic image captioning using text summarization techniques. In: Proceedings of the 13th international conference on text, speech and dialogue (TSD)

  • Qazvinian V, Radev DR (2008) Scientific paper summarization using citation summary networks. In: Proceedings of the 22nd international conference on computational linguistics (Coling 2008). pp 689–696

  • Radev DR, Fan W (2000) Automatic summarization of search engine hit lists. In: Proceedings of the ACL-2000 workshop on recent advances in natural language processing and information retrieval. pp 99–109

  • Radev DR, McKeown KR (1998) Generating Natural Language Summaries from Multiple on-line Sources. Comput Linguist 24(3): 470–500

    Google Scholar 

  • Radev DR, Tam D (2003) Summarization evaluation using relative utility. In: CIKM ’03: proceedings of the 12th international conference on information and knowledge management. pp 508–511

  • Radev DR, Blair-Goldensohn S, Zhang Z (2001) Experiments in single and multi-document summarization using MEAD. In: First document understanding conference. pp 1–7

  • Radev DR, Hovy E, McKeown K (2002) Introduction to the Special Issue on Summarization. Comput Linguist 28(4): 399–408

    Article  Google Scholar 

  • Saggion H (2008) Automatic summarization: an overview. Revue franaise de linguistique appliquée XIII(1). pp 63–81

  • Saggion H (2009) A classification algorithm for predicting the structure of summaries. In: Proceedings of the 2009 workshop on language generation and summarisation (UCNLG+Sum 2009). pp 31–38

  • Saggion H, Funk A (2009) Extracting Opinions and Facts for Business Intelligence. RNTI E-17: 119–146

    Google Scholar 

  • Saggion H, Lapalme G (2000) Selective analysis for automatic abstracting: evaluating indicativeness and acceptability. In: Proceedings of content-based multimedia information access (RIAO). pp 747–764

  • Saggion H, Lloret E, Palomar M (2010) Using text summaries for predicting rating scales. In: Proceedings of the 1st workshop on computational approaches to subjectivity and sentiment analysis (WASSA)

  • Sakai T, Sparck-Jones K (2001) Generic summaries for indexing in information retrieval. In: SIGIR ’01: proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval. pp 190–198

  • Saravanan M, Ravindran B, Raman S (2006) Improving legal document summarization using graphical models. In: Proceedings of legal knowledge and information systems—JURIX 2006: the 19th annual conference on legal knowledge and information systems. pp 51–60

  • Sauper C, Barzilay R (2009) Automatically generating Wikipedia articles: a structure-aware approach. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP. pp 208–216

  • Schilder F, Kondadadi R (2008) FastSum: fast and accurate query-based multi-document summarization. In: Proceedings of ACL-08: HLT, short papers. pp 205–208

  • Schilder F, Kondadadi R, Leidner JL, Conrad JG (2008) Thomson Reuters at TAC 2008: aggressive filtering with FastSum for update and opinion summarization. In: Proceedings of the text analysis conference (TAC)

  • Schlesinger JD, Okurowski ME, Conroy JM, O’Leary DP, Taylor A, Hobbs J, Wilson H (2002) Understanding machine performance in the context of human performance for multi-document summarization. In: Proceedings of the DUC 2002 workshop on text summarization

  • Sebastiani F (2002) Machine Learning in Automated Text Categorization. ACM Comput Surv 34(1): 1–47

  • Shen D, Chen Z, Yang Q, Zeng HJ, Zhang B, Lu Y, Ma WY (2004) Web-page classification through summarization. In: SIGIR ’04: proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval. pp 242–249

  • Shen D, Yang Q, Chen Z (2007) Noise Reduction through Summarization for Web-page Classification. Inf Process Manag 43(6): 1735–1747

  • Shi Z, Melli G, Wang Y, Liu Y, Gu B, Kashani MM, Sarkar A, Popowich F (2007) Question answering summarization of multiple biomedical documents. In: CAI ’07: proceedings of the 20th conference of the Canadian society for computational studies of intelligence on advances in artificial intelligence. pp 284–295

  • Sjöbergh J (2007) Older Versions of the ROUGEeval Summarization Evaluation System were Easier to Fool. Inf Process Manag 43(6): 1500–1505

  • Spärck Jones K (1999) Automatic summarizing: factors and directions. In: Advances in automatic text summarization. pp 1–14

  • Spärck Jones K (2007) Automatic Summarising: The State of the Art. Inf Process Manag 43(6): 1449–1481

  • Spärck Jones K, Galliers JR (eds) (1996) Evaluating natural language processing systems, an analysis and review, lecture notes in computer science, vol 1083. Springer, Berlin

  • Steinberger J, Poesio M, Kabadjov MA, Ježek K (2007) Two Uses of Anaphora Resolution in Summarization. Inf Process Manag 43(6): 1663–1680

  • Steinberger J, Ježek K, Sloup M (2008) Web topic summarization. In: Proceedings of the 12th international conference on electronic publishing. pp 322–334

  • Strzalkowski T, Harabagiu S (2007) Advances in open domain question answering (Text, Speech and Language Technology). Springer-Verlag New York, Inc., Secaucus, NJ, USA

  • Sun JT, Shen D, Zeng HJ, Yang Q, Lu Y, Chen Z (2005) Web-page summarization using clickthrough data. In: SIGIR ’05: proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval. pp 194–201

  • Svore KM, Vanderwende L, Burges CJ (2007) Enhancing single-document summarization by combining RankNet and third-party sources. In: Proceedings of the joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL). pp 448–457

  • Sweeney S, Crestani F, Losada DE (2008) Show me more: Incremental length summarisation using novelty detection. Inf Process Manag 44(2): 663–686

  • Szlávik Z, Tombros A, Lalmas M (2006) Investigating the use of summarisation for interactive XML retrieval. In: SAC ’06: Proceedings of the 2006 ACM symposium on applied computing. pp 1068–1072

  • Teng Z, Liu Y, Ren F, Tsuchiya S, Ren F (2008) Single document summarization based on local topic identification and word frequency. In: MICAI ’08: proceedings of the 2008 seventh Mexican international conference on artificial intelligence. pp 37–41. http://dx.doi.org/10.1109/MICAI.2008.12

  • Teufel S, Halteren Hv (2004) Evaluating information content by factoid analysis: human annotation and stability. In: Proceedings of the 2004 conference on empirical methods in natural language processing. pp 419–426

  • Teufel S, Moens M (2002) Summarizing Scientific Articles: Experiments with Relevance and Rhetorical Status. Comput Linguist 28(4): 409–445

  • Titov I, McDonald R (2008) A joint model of text and aspect ratings for sentiment summarization. In: Proceedings of ACL-08: HLT. pp 308–316

  • Torres-Moreno JM, St-Onge PL, Gagnon M, El-Bèze M, Bellot P (2009) Automatic summarization system coupled with a question-answering system (QAAS). NLP News Computing Language. http://arxiv.org/abs/0905.2990v1

  • Trappey A, Trappey C, Wu CY (2009) Automatic patent Document Summarization for Collaborative Knowledge Systems and Services. J Syst Sci Syst Eng 18(1): 71–94

  • Trappey AJC, Trappey CV (2008) An R&D Knowledge Management Method for Patent Document Summarization. Ind Manag Data Syst 108(2): 245–257

  • Tseng YH, Lin CJ, Lin YI (2007) Text Mining Techniques for Patent Analysis. Inf Process Manag 43(5): 1216–1247

  • Vadlapudi R, Katragadda R (2010a) On automated evaluation of readability of summaries: capturing grammaticality, focus, structure and coherence. In: Proceedings of the NAACL HLT 2010 student research workshop. pp 7–12

  • Vadlapudi R, Katragadda R (2010b) Quantitative evaluation of grammaticality of summaries. In: Proceedings of the 11th international conference on computational linguistics and intelligent text processing, CICLing. pp 736–747

  • Van Dijk TA (1972) Some aspects of text grammars. A study in Theoretical Linguistics and Poetics. The Hague-Paris, Mouton

  • Van Rijsbergen CJ (1981) Information retrieval. Elsevier, Amsterdam

  • Wan X, Yang J, Xiao J (2007) Towards a unified approach based on affinity graph to various multi-document summarizations. In: Proceedings of the 11th European conference. pp 297–308

  • Wang C, Long L, Li L (2008) HowNet based evaluation for Chinese text summarization. In: Proceedings of the international conference on natural language processing and software engineering. pp 82–87

  • Wilson T, Wiebe J, Hoffmann P (2005) Recognizing contextual polarity in phrase-level sentiment analysis. In: Proceedings of the empirical methods in natural language processing. pp 347–354

  • Witte R, Krestel R, Bergler S (2007) Generating update summaries for DUC 2007. In: The document understanding workshop (presented at the HLT/NAACL)

  • Wong KF, Wu M, Li W (2008) Extractive summarization using supervised and semi-supervised learning. In: Proceedings of the 22nd international conference on computational linguistics (Coling 2008). pp 985–992

  • Yu J, Reiter E, Hunter J, Mellish C (2007) Choosing the Content of Textual Summaries of Large Time-series Data Sets. Nat Lang Eng 13(1): 25–49

  • Zajic D, Dorr BJ, Lin J, Schwartz R (2007) Multi-candidate reduction: Sentence compression as a tool for document summarization tasks. Inf Process Manag 43(6): 1549–1570

  • Zajic DM, Dorr BJ, Lin J (2008) Single-document and multi-document summarization techniques for email threads using sentence compression. Inf Process Manag 44(4): 1600–1610

  • Zechner K, Waibel A (2000) DiaSumm: flexible summarization of spontaneous dialogues in unrestricted domains. In: Proceedings of the 18th conference on computational linguistics. pp 968–974

  • Zhou L, Ticrea M, Hovy E (2004) Multi-document biography summarization. In: Proceedings of the conference on empirical methods in natural language processing. pp 434–441

  • Zhou L, Lin CY, Munteanu DS, Hovy E (2006) ParaEval: using paraphrases to evaluate summaries automatically. In: Proceedings of the human language technology/North American association of computational linguistics conference. pp 447–454

  • Zhuang L, Jing F, Zhu XY (2006) Movie review mining and summarization. In: CIKM ’06: proceedings of the 15th ACM international conference on information and knowledge management. pp 43–50

Author information

Corresponding author

Correspondence to Elena Lloret.

About this article

Cite this article

Lloret, E., Palomar, M. Text summarisation in progress: a literature review. Artif Intell Rev 37, 1–41 (2012). https://doi.org/10.1007/s10462-011-9216-z

  • DOI: https://doi.org/10.1007/s10462-011-9216-z
