GuideRank: A Guided Ranking Graph Model for Multilingual Multi-document Summarization

Li, Haoran; Zhang, Jiajun; Zhou, Yu; Zong, Chengqing

doi:10.1007/978-3-319-50496-4_54

Conference paper
First Online: 02 December 2016

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10102)

Abstract

Multilingual multi-document summarization is the task of generating a summary in a target language from a collection of documents in multiple source languages. A straightforward approach is to automatically translate the non-target-language documents into the target language and then apply monolingual summarization methods, but the summaries generated this way are often poorly readable because of the low quality of machine translation. To solve this problem, we propose a novel graph model based on a guided edge weighting method that takes both the informativeness and the readability of summaries into account. Methodologically, our model attempts to select from the target-language documents the sentences that contain important information shared across languages, while also retaining salient sentences that cannot be covered by documents in the other languages. Experimental results on our manually labeled dataset (which will be released to the public) show that our method significantly outperforms the baseline methods.
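
The abstract does not spell out the guided edge weighting itself, so the following is only a minimal illustrative sketch of the general idea, not the authors' method: a PageRank-style ranking over a sentence graph in which edge weights depend on direction, so that machine-translated sentences pass extra weight to similar sentences originally written in the target language. The function name `guided_rank`, the `is_target` flag, the `boost` multiplier, and the bag-of-words cosine similarity are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def guided_rank(sentences, is_target, boost=2.0, damping=0.85, iters=50):
    """Score sentences with a PageRank-style walk on a directed, weighted graph.

    sentences: list of token lists, all expressed in the target language
               (either written in it or machine-translated into it).
    is_target: list of bools, True where the sentence was originally
               written in the target language (hypothetical flag).
    boost:     hypothetical multiplier applied to edges that run from a
               translated sentence to an original target-language sentence.
    """
    n = len(sentences)
    bags = [Counter(s) for s in sentences]

    # weight[i][j] is the weight of edge i -> j, i.e. how strongly i "votes" for j.
    weight = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = cosine(bags[i], bags[j])
            # Assumed guidance rule: translated sentences endorse similar
            # target-language sentences more strongly than the reverse.
            if not is_target[i] and is_target[j]:
                w *= boost
            weight[i][j] = w

    out_sum = [sum(row) for row in weight]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new_scores = []
        for j in range(n):
            incoming = sum(
                scores[i] * weight[i][j] / out_sum[i]
                for i in range(n)
                if out_sum[i] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


# Toy input: sentence 0 is native target-language text, sentences 1 and 2
# stand in for machine-translated text.
toy_sentences = [
    ["quake", "hits", "city"],
    ["quake", "hits", "city"],
    ["quake", "damage", "reported"],
]
toy_is_target = [True, False, False]
toy_scores = guided_rank(toy_sentences, toy_is_target)
```

In this toy input the first two sentences are identical, so any difference between their scores comes only from the assumed direction-dependent boost; with `boost=1.0` the graph is symmetric and the sketch reduces to an ordinary similarity-graph ranking.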

Acknowledgments

The research work has been funded by the Natural Science Foundation of China under Grant No. 61333018 and supported by the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing.

Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Haoran Li, Jiajun Zhang, Yu Zhou & Chengqing Zong
University of Chinese Academy of Sciences, Beijing, China
Haoran Li, Jiajun Zhang, Yu Zhou & Chengqing Zong

Corresponding author

Correspondence to Chengqing Zong.

Editor information

Editors and Affiliations

Microsoft Research Asia, Beijing, China
Chin-Yew Lin
Brandeis University, Waltham, Massachusetts, USA
Nianwen Xue
Peking University, Beijing, China
Dongyan Zhao
Fudan University, Shanghai, China
Xuanjing Huang
Peking University, Beijing, China
Yansong Feng

About this paper

Cite this paper

Li, H., Zhang, J., Zhou, Y., Zong, C. (2016). GuideRank: A Guided Ranking Graph Model for Multilingual Multi-document Summarization. In: Lin, CY., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds) Natural Language Understanding and Intelligent Applications. ICCPOL 2016, NLPCC 2016. Lecture Notes in Computer Science, vol 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_54

DOI: https://doi.org/10.1007/978-3-319-50496-4_54
Published: 02 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50495-7
Online ISBN: 978-3-319-50496-4
eBook Packages: Computer Science, Computer Science (R0)