Abstract
Automatic document summarization is always attractive to computer science researchers. A novel approach is proposed to address this topic and mainly focuses on the summarization of plain documents. Conventional summarization methods do not fully use the inter-sentence relevance that is not preserved during the processing. In contrast, to tackle the problem and incorporate the latent relations among sentences, our approach constructs relevance structures at sentence-level for plain documents and each sentence is scored with a significance value. Accordingly, important sentences “present” themselves automatically, and the summary paragraph is then generated by selecting top-k scored sentences. Convergence of the algorithm is proved, and experiment, which is conducted on two data sets (DUC 2006 and DUC 2007), shows that the proposed model gives convincing results.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Carbonell, J., Goldstein, J.: The use of MMR, diversity-based reranking for reordering documents and producing summaries. In: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 335–336. ACM (1998)
Conroy, J.M., O’leary, D.P.: Text summarization via hidden markov models. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 406–407. ACM (2001)
Dou, S., Sun, J.-T., Li, H., Yang, Q., Chen, Z.: Document summarization using conditional random fields. In: Proceedings of IJCAI, vol. 7, pp. 2862–2867 (2007)
Erkan, G., Radev, D.R.: LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research 22(1), 457–479 (2004)
Gong, Y., Liu, X.: Generic text summarization using relevance measure and latent semantic analysis. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 19–25. ACM (2001)
He, Z., Chen, C., Bu, J., Wang, C., Zhang, L., Cai, D., He, X.: Document summarization based on data reconstruction. In: Twenty-Sixth AAAI Conference on Artificial Intelligence (2012)
Jones, K.S.: Automatic summarising: The state of the art. Information Processing & Management 43(6), 1449–1481 (2007)
Lin, C.-Y., Hovy, E.: Automatic evaluation of summaries using n-gram co-occurrence statistics. In: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, vol. 1, pp. 71–78. Association for Computational Linguistics (2003)
Luhn, H.P.: The automatic creation of literature abstracts. IBM Journal of Research and Development 2(2), 159–165 (1958)
Ma, T., Wan, X.: Multi-document Summarization Using Minimum Distortion. In: 2010 IEEE 10th International Conference on Data Mining, pp. 354–363. IEEE (2010)
Mani, I., Bloedorn, E.: Multi-document summarization by graph search and matching. In: AAAI 1997 (1997)
Mani, I., Klein, G., House, D., Hirschman, L., Firmin, T., Sundheim, B.: SUMMAC: a text summarization evaluation. Natural Language Engineering 8(01), 43–68 (2002)
Mihalcea, R., Tarau, P.: A language independent algorithm for single and multiple document summarization. In: Proceedings of IJCNLP, vol. 5 (2005)
Porter, M.F.: An algorithm for suffix stripping. Program: Electronic Library and Information Systems 14(3), 130–137 (1993)
Radev, D.R., Jing, H., Stys, M., Tam, D.: Centroid-based summarization of multiple documents. Information Processing & Management 40(6), 919–938 (2004)
Salton, G., McGill, M.J.: Introduction to modern information retrieval, vol. 1. McGraw-Hill (1983)
Wan, X., Yang, J.: Improved affinity graph based multi-document summarization. In: Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers, pp. 181–184. Association for Computational Linguistics (2006)
Wan, X., Yang, J.: Multi-document summarization using cluster-based link analysis. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 299–306. ACM (2008)
Wang, D., Li, T., Zhu, S., Ding, C.: Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 307–314. ACM (2008)
Wasson, M.: Using leading text for news summaries: Evaluation results and implications for commercial summarization applications. In: Proceedings of the 17th International Conference on Computational Linguistics, vol. 2, pp. 1364–1368. Association for Computational Linguistics (1998)
Yin, W., Pei, Y., Zhang, F., Huang, L.: Query-focused multi-document summarization based on query-sensitive feature space. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM 2012, pp. 1652–1656. ACM, New York (2012)
Zha, H.: Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering. In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 113–120. ACM (2002)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, X., Zhu, S., Xie, H., Li, Q. (2013). Document Summarization via Self-Present Sentence Relevance Model. In: Meng, W., Feng, L., Bressan, S., Winiwarter, W., Song, W. (eds) Database Systems for Advanced Applications. DASFAA 2013. Lecture Notes in Computer Science, vol 7826. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37450-0_24
Download citation
DOI: https://doi.org/10.1007/978-3-642-37450-0_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37449-4
Online ISBN: 978-3-642-37450-0
eBook Packages: Computer ScienceComputer Science (R0)