GuideRank: A Guided Ranking Graph Model for Multilingual Multi-document Summarization

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10102)

Abstract

Multilingual multi-document summarization is the task of generating a summary in a target language from a collection of documents in multiple source languages. A straightforward approach is to automatically translate the non-target-language documents into the target language and then apply monolingual summarization methods, but the summaries generated this way are often poorly readable due to the low quality of machine translation. To solve this problem, we propose a novel graph model based on a guided edge weighting method in which both the informativeness and the readability of summaries are fully taken into consideration. Methodologically, our model attempts to choose from the target-language documents the sentences that contain important information shared across languages, and it also retains the salient sentences that cannot be covered by documents in other languages. Experimental results on our manually labeled dataset (which will be released to the public) show that our method significantly outperforms the baseline methods.
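
The abstract does not give the model's formulas, so the following is only a generic sketch of graph-based sentence ranking with asymmetric edge weights, not the GuideRank weighting itself. It assumes bag-of-words cosine similarity, a PageRank-style iteration, and a hypothetical `guide` factor that boosts edges pointing from machine-translated sentences to sentences originally written in the target language, so that shared information raises the score of the more readable sentence. The function names, the `guide` parameter, and the toy tokens are all illustrative assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, is_native, guide=2.0, damping=0.85, iters=50):
    """Score sentences with a PageRank-style walk over a similarity graph.

    sentences: list of token lists, all expressed in the target language.
    is_native: is_native[i] is True if sentence i was originally written in
               the target language, False if it was machine-translated.
    guide:     hypothetical boost (an assumption, not from the paper) applied
               to edges that run from a translated sentence to a native one.
    """
    n = len(sentences)
    bags = [Counter(s) for s in sentences]

    # weight[i][j] is the weight of the directed edge i -> j.
    weight = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            sim = cosine(bags[i], bags[j])
            if not is_native[i] and is_native[j]:
                sim *= guide  # steer score toward native sentences
            weight[i][j] = sim

    out_sum = [sum(row) for row in weight]
    scores = [1.0 / n] * n
    for _ in range(iters):
        updated = []
        for j in range(n):
            incoming = sum(
                scores[i] * weight[i][j] / out_sum[i]
                for i in range(n)
                if out_sum[i] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


# Toy input: one native sentence and two translated ones with the same content.
toy = [["quake", "hits", "city"]] * 3
toy_scores = rank_sentences(toy, is_native=[True, False, False])
```

With `guide` set to 1.0 the graph is symmetric and the three toy sentences receive equal scores; with a larger value the native sentence ranks highest. The sketch does not cover the second behaviour described in the abstract, retaining salient sentences that other-language documents do not cover.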

Notes

  1. http://nlp.csai.tsinghua.edu.cn/~ly/systems/TsinghuaAligner/TsinghuaAligner.html.

  2. http://news.google.com/.

Acknowledgments

The research work has been funded by the Natural Science Foundation of China under Grant No. 61333018 and supported by the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing.

Author information

Corresponding author

Correspondence to Chengqing Zong.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Li, H., Zhang, J., Zhou, Y., Zong, C. (2016). GuideRank: A Guided Ranking Graph Model for Multilingual Multi-document Summarization. In: Lin, CY., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds) Natural Language Understanding and Intelligent Applications. ICCPOL 2016, NLPCC 2016. Lecture Notes in Computer Science, vol 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_54

  • DOI: https://doi.org/10.1007/978-3-319-50496-4_54

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50495-7

  • Online ISBN: 978-3-319-50496-4

  • eBook Packages: Computer Science, Computer Science (R0)
