Do summaries help?

ABSTRACT
We describe a task-based evaluation to determine whether multi-document summaries measurably improve user performance when using online news browsing systems for directed research. We evaluated the multi-document summaries generated by Newsblaster, a robust news browsing system that clusters online news articles and summarizes the multiple articles covering each event. Four groups of subjects performed the same time-restricted fact-gathering tasks, reading news under different conditions: no summaries at all, single-sentence summaries drawn from one of the articles, Newsblaster multi-document summaries, and human-written summaries. Our results show that, compared with source documents alone, the quality of reports assembled using Newsblaster summaries was significantly better, and user satisfaction was higher with both Newsblaster and human summaries.
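The abstract notes that Newsblaster clusters online news articles by event before summarizing each cluster. As a minimal illustration only (not Newsblaster's actual algorithm; the function names, greedy single-pass strategy, and similarity threshold are all assumptions), articles can be grouped by cosine similarity over TF-IDF vectors:

```python
# Hypothetical sketch of event clustering for news articles.
# Greedy single-pass clustering over TF-IDF vectors; the 0.3
# threshold is an arbitrary illustrative choice.
import math
from collections import Counter

def tokenize(text):
    return [w.lower().strip(".,!?") for w in text.split()]

def tfidf_vectors(docs):
    """Map each document to a sparse {word: tf-idf weight} dict."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(w for toks in tokenized for w in set(toks))
    n = len(docs)
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        # Words occurring in every document get idf 0, so drop them.
        vecs.append({w: tf[w] * math.log(n / df[w]) for w in tf if df[w] < n})
    return vecs

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(docs, threshold=0.3):
    """Attach each article to the first cluster whose seed article is
    similar enough; otherwise start a new cluster. Returns lists of
    document indices."""
    vecs = tfidf_vectors(docs)
    clusters = []
    for i, v in enumerate(vecs):
        for c in clusters:
            if cosine(vecs[c[0]], v) >= threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

Production systems typically use richer features and better clustering algorithms, but the sketch conveys the basic pipeline stage: partition the day's articles into event groups, then run the multi-document summarizer once per group.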