TSGVi: a graph-based summarization system for Vietnamese documents

Nguyen-Hoang, Tu-Anh; Nguyen, Khai; Tran, Quang-Vinh

doi:10.1007/s12652-012-0143-x

Original Research
Published: 27 June 2012

Journal of Ambient Intelligence and Humanized Computing, Volume 3, pages 305–313 (2012)

Abstract

This paper proposes an automatic method for generating an extractive summary of multiple Vietnamese documents on a common topic by modeling the documents as weighted undirected graphs. It first builds undirected graphs whose vertices represent the sentences of the documents and whose edges indicate the similarity between sentences. The PageRank algorithm is then applied to compute a salience score for each sentence. Sentences are ranked by salience score and selected using maximal marginal relevance to form summaries. These summaries are combined, and the same process is applied once more to produce the final extractive summary of the document set. Experiments were performed on Vietnamese news articles and on the English DUC 2002, 2003, and 2007 data. The results demonstrate the effectiveness of the proposed technique relative to the reference systems.
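The pipeline described in the abstract (similarity graph, PageRank scoring, maximal marginal relevance selection, then a second pass over the combined summaries) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the whitespace tokenizer, the bag-of-words cosine similarity, the damping factor, the `lam` trade-off weight, and all function names are assumptions chosen for illustration. In particular, real Vietnamese text would likely need proper word segmentation rather than splitting on spaces.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def similarity_matrix(sentences):
    """Weighted undirected graph: entry [i][j] is the similarity of sentences i and j."""
    # Assumption: naive whitespace tokenization stands in for real word segmentation.
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    return [[0.0 if i == j else cosine(bags[i], bags[j]) for j in range(n)]
            for i in range(n)]


def pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration; isolated vertices keep only the base score."""
    n = len(weights)
    out = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(weights[j][i] / out[j] * scores[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def mmr_select(scores, weights, k, lam=0.7):
    """Greedy maximal marginal relevance: trade salience against redundancy."""
    top = max(scores)
    selected = []
    candidates = set(range(len(scores)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max((weights[i][j] for j in selected), default=0.0)
            return lam * scores[i] / top - (1 - lam) * redundancy
        best = max(sorted(candidates), key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected


def summarize(sentences, k=2):
    """Summarize one list of sentences, keeping the chosen ones in original order."""
    if not sentences:
        return []
    weights = similarity_matrix(sentences)
    chosen = mmr_select(pagerank(weights), weights, k)
    return [sentences[i] for i in sorted(chosen)]


def summarize_documents(documents, per_doc=2, k=3):
    """Two passes: summarize each document, then summarize the combined summaries."""
    combined = [s for doc in documents for s in summarize(doc, per_doc)]
    return summarize(combined, k)
```

Here `summarize_documents` takes a list of documents, each already split into a list of sentences, and returns at most `k` sentences. Scores are divided by the top score inside `mmr_select` so that salience and similarity are on comparable scales; that normalization is a design choice of this sketch, not something stated in the abstract.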

Acknowledgments

The authors would like to thank Prof. Kiem Hoang from the University of Information Technology, VNU, HCM City for his invaluable and insightful comments. The authors also thank the anonymous reviewers for their helpful comments.

Author information

Authors and Affiliations

Faculty of Information Technology, University of Science, VNU-HCM, Ho Chi Minh, Vietnam
Tu-Anh Nguyen-Hoang, Khai Nguyen & Quang-Vinh Tran

Corresponding author

Correspondence to Tu-Anh Nguyen-Hoang.

About this article

Cite this article

Nguyen-Hoang, TA., Nguyen, K. & Tran, QV. TSGVi: a graph-based summarization system for Vietnamese documents. J Ambient Intell Human Comput 3, 305–313 (2012). https://doi.org/10.1007/s12652-012-0143-x

Received: 30 June 2011
Accepted: 05 June 2012
Published: 27 June 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s12652-012-0143-x
