Update summarization based on novel topic distribution

Authors:
Josef Steinberger, University of West Bohemia, Pilsen, Czech Republic
Karel Ježek, University of West Bohemia, Pilsen, Czech Republic

DocEng '09: Proceedings of the 9th ACM symposium on Document engineering, September 2009, Pages 205–213
https://doi.org/10.1145/1600193.1600239

Published: 16 September 2009

ABSTRACT

This paper presents our recent research in text summarization. The field has moved from multi-document summarization to update summarization. When producing an update summary of a set of topic-related documents, the summarizer assumes that the reader already has prior knowledge, determined by a set of older documents on the same topic. The update summarizer must therefore solve a novelty vs. redundancy problem. We describe the development of our summarizer, which is based on Iterative Residual Rescaling (IRR), a method that creates the latent semantic space of the set of documents under consideration. IRR generalizes Singular Value Decomposition (SVD) and makes it possible to control the influence of major and minor topics in the latent space. Our sentence-extractive summarization method computes the redundancy, novelty and significance of each topic. These values are then used in the sentence selection process, which also prevents redundancy within the summary. The results of our participation in the TAC evaluation appear promising.
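
The relationship between IRR and SVD mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It follows the general scheme of Iterative Residual Rescaling as described by Ando and Lee: repeatedly rescale the residual vectors by a power of their length, take the dominant direction, and remove it from the residuals. The function name `irr_basis`, the matrix layout (terms in rows, sentences in columns) and the scaling exponent `q` are assumptions made for illustration. With `q = 0` no rescaling takes place and the basis coincides with the left singular vectors of SVD (up to sign); larger values of `q` give more weight to long residuals, that is, to content not yet covered by the topics found so far.

```python
import numpy as np

def irr_basis(A, k, q=1.0):
    """Sketch of Iterative Residual Rescaling (hypothetical helper).

    A : term-by-sentence matrix (assumed layout: one column per sentence)
    k : number of latent topics (basis vectors) to extract
    q : scaling exponent; q = 0 reproduces the SVD basis
    Returns a matrix whose k columns are orthonormal topic vectors.
    """
    R = np.asarray(A, dtype=float).copy()   # residual vectors, initially A
    basis = []
    for _ in range(k):
        # Rescale every residual column by its own length raised to q.
        norms = np.linalg.norm(R, axis=0)
        Rs = R * norms**q
        # Dominant direction of the rescaled residuals.
        U, _, _ = np.linalg.svd(Rs, full_matrices=False)
        b = U[:, 0]
        basis.append(b)
        # Remove the component along b from every residual column.
        R = R - np.outer(b, b @ R)
    return np.column_stack(basis)

# Usage: project sentences into the latent space spanned by the topics.
rng = np.random.default_rng(0)
A_demo = rng.random((6, 5))              # toy data, not from the paper
topics = irr_basis(A_demo, k=3, q=2.0)
sentence_coords = topics.T @ A_demo      # shape (3, 5): topic weights per sentence
```

How the redundancy, novelty and significance of each topic are derived from such a space is not specified in the abstract, so the sketch stops at the construction of the latent space.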

Published in
DocEng '09: Proceedings of the 9th ACM symposium on Document engineering
September 2009, 264 pages
ISBN: 9781605585758
DOI: 10.1145/1600193
General Chair: Uwe M. Borghoff, Universität der Bundeswehr München, Germany
Program Chair: Boris Chidlovskii, Xerox Research Centre Europe, France
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
iterative residual rescaling
latent semantic analysis
summary evaluation
text summarization
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 178 of 537 submissions, 33%