The two-stage unsupervised approach to multidocument summarization

Alyguliyev, R. M.

doi:10.3103/S0146411609050083

Published: 14 November 2009

Volume 43, pages 276–284, (2009)

Automatic Control and Computer Sciences

Abstract

This paper proposes an approach to summarizing a set of documents that reveals the topics and extracts informative sentences. The topics are determined by clustering sentences, and the informative sentences are extracted using a ranking algorithm. The summarization result is shown to depend on the clustering method, the ranking algorithm, and the similarity measure. Experiments on the open benchmark datasets DUC2001 and DUC2002 show that the proposed clustering methods and ranking algorithm yield better results than the well-known k-means method and the ranking algorithms PageRank and HITS.
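
The two-stage idea described in the abstract (cluster sentences into topics, then rank sentences to pick informative ones) can be illustrated with a minimal sketch. The abstract does not specify the paper's clustering methods, ranking algorithm, or similarity measure, so everything below is an assumption chosen for brevity: term-frequency vectors with cosine similarity, a greedy threshold-based clustering as a stand-in for stage one, and within-cluster similarity sums as a stand-in for stage two. The function names and the `threshold` parameter are hypothetical.

```python
import math
from collections import Counter


def tf_vector(sentence):
    # Bag-of-words term frequencies (assumed representation, not the paper's).
    return Counter(sentence.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cluster_sentences(vectors, threshold=0.2):
    # Stage 1 (stand-in): greedy single-pass clustering. Each sentence joins
    # the cluster holding its most similar member, or starts a new cluster
    # if no similarity reaches the threshold.
    clusters = []
    for i, v in enumerate(vectors):
        best, best_sim = None, threshold
        for c in clusters:
            sim = max(cosine(v, vectors[j]) for j in c)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


def rank_cluster(cluster, vectors):
    # Stage 2 (stand-in): score each sentence by its total similarity to the
    # other sentences in its cluster; ties are broken by original position.
    scores = {
        i: sum(cosine(vectors[i], vectors[j]) for j in cluster if j != i)
        for i in cluster
    }
    return sorted(cluster, key=lambda i: (-scores[i], i))


def summarize(sentences, threshold=0.2):
    # Take the top-ranked sentence of every cluster, in document order.
    vectors = [tf_vector(s) for s in sentences]
    clusters = cluster_sentences(vectors, threshold)
    picks = sorted(rank_cluster(c, vectors)[0] for c in clusters)
    return [sentences[i] for i in picks]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "stocks fell sharply today",
        "stocks fell again today",
    ]
    for line in summarize(docs):
        print(line)
```

Swapping in a different clustering routine, ranking function, or similarity measure changes which sentences are selected, which is the dependence the abstract reports.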

Author information

Authors and Affiliations

Institute of Information Technologies, National Academy of Sciences of Azerbaijan, ul. F. Agaeva 9, Baku, Az-1141, Azerbaijan
R. M. Alyguliyev

Corresponding author

Correspondence to R. M. Alyguliyev.

Additional information

Original Russian Text © R.M. Alyguliyev, 2009, published in Avtomatika i Vychislitel’naya Tekhnika, 2009, No. 5, pp. 72–82.

About this article

Cite this article

Alyguliyev, R.M. The two-stage unsupervised approach to multidocument summarization. Aut. Control Comp. Sci. 43, 276–284 (2009). https://doi.org/10.3103/S0146411609050083

Issue Date: October 2009
DOI: https://doi.org/10.3103/S0146411609050083
