Abstract
A Multi-document Rhetorical Structure (MRS) is proposed for the multi-document automatic summarization task. In this structure, the interrelationships between text units, including the correlation between units calculated from a hierarchical topic tree, rhetorical relationships, and temporal relationships, are represented at different levels of granularity. MRS simplifies the traditional multi-document representation of cross-document structure theory and supplements it with information on the change and distribution of event topics, which cannot be obtained in information fusion theory. Concretely, a series of algorithms is proposed, covering MRS construction, MRS-based multi-document information fusion, and summary generation. Sets of experiments test the ability of the MRS strategies to fuse multiple knowledge sources concurrently, and the results are good.
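To make the structure described in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes an MRS can be approximated as a set of text units at several granularities linked by typed, weighted relations. The names `TextUnit`, `MRS`, `relate`, `score`, and `extract`, as well as the scoring rule (sum of incident relation weights), are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Relation kinds named in the abstract; everything else here is an assumption.
RELATION_KINDS = {"correlation", "rhetorical", "temporal"}


@dataclass(frozen=True)
class TextUnit:
    """A text unit at some granularity (hypothetical representation)."""
    doc_id: str
    level: str   # e.g. "document", "paragraph", "sentence"
    index: int
    text: str


@dataclass
class MRS:
    """Toy container: units plus typed, weighted relations between them."""
    units: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (unit_a, unit_b, kind, weight)

    def add_unit(self, unit):
        self.units.append(unit)
        return unit

    def relate(self, a, b, kind, weight=1.0):
        if kind not in RELATION_KINDS:
            raise ValueError(f"unknown relation kind: {kind}")
        self.edges.append((a, b, kind, weight))

    def score(self, unit):
        # Assumed scoring rule: sum the weights of all relations touching the unit.
        return sum(w for a, b, _, w in self.edges if unit in (a, b))

    def extract(self, k):
        # Rank sentence-level units by score and keep the top k.
        sentences = [u for u in self.units if u.level == "sentence"]
        return sorted(sentences, key=self.score, reverse=True)[:k]


mrs = MRS()
s1 = mrs.add_unit(TextUnit("d1", "sentence", 0, "The storm hit the coast on Monday."))
s2 = mrs.add_unit(TextUnit("d2", "sentence", 0, "A storm reached the coast early this week."))
s3 = mrs.add_unit(TextUnit("d2", "sentence", 1, "Repairs began on Tuesday."))
mrs.relate(s1, s2, "correlation", 0.8)
mrs.relate(s1, s3, "temporal", 0.5)
summary = mrs.extract(2)
```

In this toy example `s1` participates in both relations, so it ranks first. A real system would derive the weights from the topic tree, rhetorical analysis, and temporal information rather than assign them by hand.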
Acknowledgements
This work was supported by Project 60803092 of the National Science Foundation of China, the Promotive Research Fund for Excellent Young and Middle-aged Scientists of Shandong Province (2010BSA10014), and the WeiHai City Science & Technology Fund Planning Project (2010-3-96).
About this article
Cite this article
Xu, YD., Zhang, XD., Quan, GR. et al. MRS for multi-document summarization by sentence extraction. Telecommun Syst 53, 91–98 (2013). https://doi.org/10.1007/s11235-013-9681-6
DOI: https://doi.org/10.1007/s11235-013-9681-6