ABSTRACT
Real-time document summarization is a critical need today, owing to the large volume of information available and our inability to process all of it with limited time and resources. Information is often spread across multiple sources that offer different contexts and viewpoints on a single topic of interest. Automated multi-document summarization (MDS) techniques aim to address this problem. However, current automated MDS techniques suffer from low precision and accuracy with respect to a given subject matter when compared with human-written summaries, and they take a long time to produce a summary when the input is very large. In this paper, we propose a hybrid MDS technique that combines feature-based algorithms and dynamic programming to generate a summary from multiple documents based on a user-provided query. Further, in real-world scenarios, Web search returns a large number of URLs, and the work of making sense of them with reference to a particular query is left to the user. In this context, we also present an efficient parallelized MDS technique based on Hadoop that serves a concise summary of the contents of multiple Web pages for a given user query in less time.
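The abstract does not spell out the feature set or the dynamic-programming formulation, so the following is only an illustrative sketch of how the two ideas might fit together, not the authors' method. It assumes two hypothetical features (query-term overlap and sentence position, with arbitrary weights 0.8 and 0.2) and treats sentence selection as a 0/1 knapsack problem over a word budget. All function names are invented for illustration.

```python
import re
from typing import List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_sentence(sentence: str, query_terms: Set[str],
                   position: int, total: int) -> float:
    """Assumed feature mix: query overlap plus a bonus for early sentences."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    overlap = sum(1 for t in tokens if t in query_terms) / len(tokens)
    position_bonus = 1.0 - position / total
    return 0.8 * overlap + 0.2 * position_bonus  # weights are arbitrary


def select_sentences(sentences: List[str], scores: List[float],
                     budget: int) -> List[int]:
    """0/1 knapsack by dynamic programming: maximize the total score
    while keeping the total word count within the budget."""
    weights = [len(tokenize(s)) for s in sentences]
    n = len(sentences)
    # best[i][b] = best score using the first i sentences within b words
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], scores[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]
            if w <= b and best[i - 1][b - w] + v > best[i][b]:
                best[i][b] = best[i - 1][b - w] + v
    # Backtrack to recover which sentences were chosen.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= weights[i - 1]
    return sorted(chosen)


def summarize(documents: List[str], query: str, budget: int = 30) -> List[str]:
    """Score every sentence of every document against the query,
    then pick the best subset that fits the word budget."""
    query_terms = set(tokenize(query))
    sentences, scores = [], []
    for doc in documents:
        parts = [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]
        for pos, sent in enumerate(parts):
            sentences.append(sent)
            scores.append(score_sentence(sent, query_terms, pos, len(parts)))
    return [sentences[i] for i in select_sentences(sentences, scores, budget)]
```

In this sketch each document is scored independently of the others, which is the kind of step that could in principle be distributed as a map phase, with selection done afterwards; whether the Hadoop variant described in the abstract is organized this way is not stated there.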
- Query-oriented Unsupervised Multi-document Summarization on Big Data