Graph-Based Methods for Multi-document Summarization: Exploring Relationship Maps, Complex Networks and Discourse Information

Ribaldo, Rafael; Akabane, Ademar Takeo; Rino, Lucia Helena Machado; Pardo, Thiago Alexandre Salgueiro

doi:10.1007/978-3-642-28885-2_30

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7243)

Included in the following conference series:

International Conference on Computational Processing of the Portuguese Language

Abstract

In this work, we investigate the use of graphs for multi-document summarization. We adapt the traditional Relationship Map approach to the multi-document scenario and, in a hybrid approach, we consider adding CST (Cross-document Structure Theory) relations to this adapted model. We also investigate some measures derived from graphs and complex networks for sentence selection. We show that the superficial graph-based methods are promising for the task. More importantly, some of them perform almost as well as a deep approach.
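As a rough illustration of what graph-based sentence selection can look like, the following is a minimal sketch, not the authors' method. It assumes sentences from all source documents become nodes, that two sentences are linked when their bag-of-words cosine similarity reaches a cut-off, and that sentences are ranked by node degree. The function names, the `threshold` value, and the choice of degree as the ranking measure are all assumptions made for illustration; the abstract does not specify the graph construction, the measures used, or how CST relations are integrated.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_sentences(documents, k=2, threshold=0.2):
    """Pick the k highest-degree sentences from a sentence similarity graph.

    documents: list of documents, each a list of sentence strings.
    threshold: hypothetical similarity cut-off for linking two sentences.
    Returns the selected sentences in their original reading order.
    """
    sentences = [s for doc in documents for s in doc]
    vectors = [Counter(tokenize(s)) for s in sentences]
    degree = [0] * len(sentences)
    # Link every pair of sentences whose similarity reaches the threshold.
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                degree[i] += 1
                degree[j] += 1
    # Highest degree first; the stable sort breaks ties by position.
    ranked = sorted(range(len(sentences)), key=lambda i: -degree[i])
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    docs = [
        ["the cat sat on the mat", "dogs bark loudly"],
        ["the cat sat on a mat", "the cat is on the mat"],
    ]
    for sentence in select_sentences(docs, k=2):
        print(sentence)
```

A sketch like this makes no attempt to remove redundant sentences, which matters in the multi-document setting because near-duplicate sentences from different sources tend to be highly connected to each other.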

Author information

Authors and Affiliations

Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, Brazil
Rafael Ribaldo, Ademar Takeo Akabane & Thiago Alexandre Salgueiro Pardo
Departamento de Computação, Universidade Federal de São Carlos, Brazil
Lucia Helena Machado Rino

Editor information

Editors and Affiliations

UFSCAR, Rod. Washington Luís, 13565-905, São Carlos, Brazil
Helena Caseli
UFRGS, Av. Bento Gonçalves, 9500, 91501-970, Porto Alegre, Brazil
Aline Villavicencio
DETI/IEETA, Universidade de Aveiro, Campus Universitário de Santiago, 3810-193, Aveiro, Portugal
António Teixeira
UC/ IT, DEEC, Universidade de Coimbra, Polo 2, 3030-290, Coimbra, Portugal
Fernando Perdigão

Cite this paper

Ribaldo, R., Akabane, A.T., Rino, L.H.M., Pardo, T.A.S. (2012). Graph-Based Methods for Multi-document Summarization: Exploring Relationship Maps, Complex Networks and Discourse Information. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_30

DOI: https://doi.org/10.1007/978-3-642-28885-2_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28884-5
Online ISBN: 978-3-642-28885-2
eBook Packages: Computer Science, Computer Science (R0)
