DOI: 10.1145/2009916.2010016
research-article

Evolutionary timeline summarization: a balanced optimization framework via iterative substitution

Published: 24 July 2011

ABSTRACT

Classic news summarization plays an important role as the number of documents on the Web grows exponentially. Many approaches have been proposed to generate summaries, but they seldom consider the evolutionary characteristics of news in addition to traditional summary elements. We therefore present a novel framework for a web mining problem named Evolutionary Timeline Summarization (ETS). Given a massive collection of time-stamped web documents related to a general news query, ETS aims to return the evolution trajectory along the timeline, consisting of individual but correlated summaries for each date, emphasizing relevance, coverage, coherence and cross-date diversity. ETS greatly facilitates fast news browsing and knowledge comprehension and is hence a necessity. We formulate the task as an optimization problem solved via iterative substitution from a set of sentences to a subset of sentences that satisfies the above requirements, balancing coherence/diversity measurement and local/global summary quality. The optimized substitution is conducted iteratively, incorporating several constraints, until convergence. We develop experimental systems and evaluate them on 6 intrinsically different datasets totaling 10251 documents. Performance comparisons between system-generated timelines and timelines manually created by human editors demonstrate the effectiveness of our proposed framework in terms of ROUGE metrics.
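
The idea of iterative substitution can be illustrated with a small sketch. This is not the paper's formulation: the bag-of-words cosine similarity, the toy `utility` objective (query relevance minus pairwise redundancy), the `redundancy_weight` parameter, and the function names are all assumptions made for illustration. It handles a single date only and omits the coherence and cross-date terms the abstract mentions. Starting from an initial subset of sentences, it repeatedly swaps one selected sentence for an unselected candidate whenever the swap raises the objective, and stops when no swap helps.

```python
from collections import Counter
from math import sqrt
from typing import List


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def utility(summary: List[str], query: str, redundancy_weight: float = 0.5) -> float:
    """Toy objective (an assumption): query relevance minus pairwise redundancy."""
    relevance = sum(cosine(s, query) for s in summary)
    redundancy = sum(
        cosine(summary[i], summary[j])
        for i in range(len(summary))
        for j in range(i + 1, len(summary))
    )
    return relevance - redundancy_weight * redundancy


def iterative_substitution(
    candidates: List[str], query: str, size: int, max_rounds: int = 100
) -> List[str]:
    """Swap one summary sentence per round for the best-improving candidate."""
    summary = list(candidates[:size])  # naive initial summary: first `size` sentences
    for _ in range(max_rounds):
        current = utility(summary, query)
        best_gain, best_swap = 0.0, None
        for i in range(len(summary)):
            for cand in candidates:
                if cand in summary:
                    continue
                trial = summary[:i] + [cand] + summary[i + 1:]
                gain = utility(trial, query) - current
                # Small tolerance so floating-point noise is not counted as a gain.
                if gain > best_gain + 1e-12:
                    best_gain, best_swap = gain, (i, cand)
        if best_swap is None:  # converged: no single substitution improves the objective
            break
        i, cand = best_swap
        summary[i] = cand
    return summary
```

Because each accepted swap strictly increases the objective and the number of possible subsets is finite, the loop terminates; like any local search of this kind, it may stop at a local optimum rather than the best possible subset.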


        • Published in
          SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
          July 2011
          1374 pages
          ISBN: 9781450307574
          DOI: 10.1145/2009916

          Copyright © 2011 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 792 of 3,983 submissions, 20%
