DOI: 10.1145/1076034.1076071

Topic themes for multi-document summarization

Published: 15 August 2005

Abstract

The use of topic representations for multi-document summarization (MDS) has received considerable attention recently. In this paper, we describe five different topic representations and introduce a novel representation of topics based on topic themes. We present eight different methods of generating multi-document summaries and evaluate each of them on a large set of topics used in past DUC workshops. Our evaluation results show a significant improvement in the quality of summaries based on topic themes over those produced by MDS methods that use alternative topic representations.

Published In

SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
August 2005
708 pages
ISBN: 1595930345
DOI: 10.1145/1076034
Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. summarization
2. topic themes

      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%
