Hierarchical Graph Summarization: Leveraging Hybrid Information through Visible and Invisible Linkage

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7302)

Abstract

Graph-based ranking algorithms have recently been exploited for summarization by using sentence-to-sentence relationships. Given a document set with linkage information to summarize, different sentences belong to different documents or clusters (either visible clusters via anchor texts or invisible clusters via semantics), which gives rise to a hierarchical structure. It is challenging and interesting to investigate the impacts and weights of source documents/clusters: sentences from important ones are deemed more salient than the others. This paper aims to integrate three types of hierarchical linkage into traditional graph-based methods by proposing Hierarchical Graph Summarization (HGS). We utilize a hierarchical language model to measure the sentence relationships in HGS. We develop experimental systems to compare 5 rival algorithms on 4 intrinsically different datasets, which amount to 5197 documents. Performance comparisons between system-generated summaries and summaries manually created by human editors demonstrate the effectiveness of our approach in terms of ROUGE metrics.
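The general idea of graph-based sentence ranking with source weighting can be illustrated with a minimal sketch. This is not the HGS method itself: it assumes plain bag-of-words cosine similarity in place of the paper's hierarchical language model, a PageRank-style iteration, and a hypothetical `doc_weight` mapping that stands in for the importance of each source document or cluster. All function and parameter names are illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_sentences(sentences, doc_ids, doc_weight, damping=0.85, iters=50):
    """Score sentences by a PageRank-style walk over a similarity graph.

    doc_ids[i] is the source document (or cluster) of sentences[i];
    doc_weight maps each id to a positive importance weight (assumed
    given here, not learned). Edges pointing into a sentence are scaled
    by the weight of that sentence's source.
    """
    n = len(sentences)
    bags = [Counter(tokenize(s)) for s in sentences]

    # Edge i -> j: similarity scaled by the importance of j's source.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = cosine(bags[i], bags[j]) * doc_weight[doc_ids[j]]

    # Row-normalize so each row is a probability distribution;
    # a sentence with no similar neighbours jumps uniformly.
    for row in w:
        total = sum(row)
        for j in range(n):
            row[j] = row[j] / total if total > 0 else 1.0 / n

    # Power iteration with uniform teleportation.
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * w[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


if __name__ == "__main__":
    sents = [
        "graph ranking selects salient sentences",
        "graph ranking scores sentences",
        "salient sentences come from important documents",
    ]
    for s, score in zip(sents, rank_sentences(sents, ["a", "a", "b"], {"a": 1.0, "b": 5.0})):
        print(f"{score:.3f}  {s}")
```

In this sketch, raising the weight of a source shifts score toward its sentences, which mirrors the abstract's premise that sentences from important documents or clusters are deemed more salient.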


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yan, R., Yuan, Z., Wan, X., Zhang, Y., Li, X. (2012). Hierarchical Graph Summarization: Leveraging Hybrid Information through Visible and Invisible Linkage. In: Tan, PN., Chawla, S., Ho, C.K., Bailey, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science, vol 7302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30220-6_9

  • DOI: https://doi.org/10.1007/978-3-642-30220-6_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30219-0

  • Online ISBN: 978-3-642-30220-6

  • eBook Packages: Computer Science, Computer Science (R0)
