Abstract
Graph-based ranking methods have been successfully applied to multi-document summarization by adapting link analysis algorithms such as PageRank and HITS to incorporate diverse relationships into sentence evaluation. Both the homogeneous relationships between sentences and the heterogeneous relationships between sentences and documents have been investigated in the past. However, for query-focused multi-document summarization, three other kinds of relationships (i.e. the relationships between documents, the relationships between the given query and documents, and the sentence-to-document correlation strength) are seldom considered when computing a sentence's importance. To address this limitation, this study proposes a novel Co-HITS-Ranking based approach to query-biased summarization, which fuses all of the above relationships, whether homogeneous or heterogeneous, in a unified two-layer graph model under the assumption that significant sentences and significant documents can be both self-boosted and mutually boosted. In the model, the manifold-ranking algorithm first assigns initial biased information richness scores to sentences and to documents separately, based only on the local recommendations between homogeneous objects. The Co-HITS-Ranking algorithm then incorporates these initial scores into a mutual reinforcement framework that co-ranks the heterogeneous objects jointly. The final score of each sentence is obtained through an iterative updating process. Experimental results on the DUC datasets demonstrate the effectiveness of the proposed approach.
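The mutual reinforcement step described above can be illustrated with a minimal sketch. It assumes a Co-HITS-style update in which each score is a convex combination of its initial value and the scores propagated from the other layer. It is not the paper's exact formulation: the function names, the damping parameters `lam_s` and `lam_d`, and the toy weights are all hypothetical, and the initial scores are supplied by hand rather than computed by manifold ranking.

```python
def normalize(values):
    """Scale non-negative values to sum to 1 (returns a copy if the sum is 0)."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else list(values)


def co_rank(weights, sent_init, doc_init, lam_s=0.5, lam_d=0.5, iters=50):
    """Co-HITS-style co-ranking on a sentence-document bipartite graph.

    weights[i][j] is a hypothetical affinity between sentence i and document j.
    sent_init / doc_init stand in for the initial biased scores (assumed given).
    """
    n_s, n_d = len(sent_init), len(doc_init)
    # Transition probabilities in each direction across the two layers.
    s_to_d = [normalize(row) for row in weights]
    d_to_s = [normalize([weights[i][j] for i in range(n_s)]) for j in range(n_d)]
    s0, d0 = normalize(sent_init), normalize(doc_init)
    s, d = list(s0), list(d0)
    for _ in range(iters):
        # Each layer keeps part of its initial score and receives the rest
        # from the other layer (simultaneous update from the previous round).
        new_s = [
            (1 - lam_s) * s0[i] + lam_s * sum(d_to_s[j][i] * d[j] for j in range(n_d))
            for i in range(n_s)
        ]
        new_d = [
            (1 - lam_d) * d0[j] + lam_d * sum(s_to_d[i][j] * s[i] for i in range(n_s))
            for j in range(n_d)
        ]
        s, d = new_s, new_d
    return s, d


# Toy example: three sentences, two documents (all numbers are made up).
toy_weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
sent_scores, doc_scores = co_rank(toy_weights, [0.6, 0.3, 0.1], [0.7, 0.3])
ranked_sentences = sorted(range(len(sent_scores)), key=lambda i: -sent_scores[i])
```

In this sketch, setting `lam_s = lam_d = 0` returns the initial scores unchanged, while larger values give more weight to the scores propagated from the other layer.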
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hu, P., Ji, D., Teng, C. (2010). Co-HITS-Ranking Based Query-Focused Multi-document Summarization. In: Cheng, PJ., Kan, MY., Lam, W., Nakov, P. (eds) Information Retrieval Technology. AIRS 2010. Lecture Notes in Computer Science, vol 6458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17187-1_11
DOI: https://doi.org/10.1007/978-3-642-17187-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17186-4
Online ISBN: 978-3-642-17187-1
eBook Packages: Computer Science (R0)