Abstract
Automating document summarization plays a major role in many applications. Automatic text summarization focuses on retaining the essential information without degrading document quality. This paper proposes a new multi-document summarization method that combines a topic model with a fuzzy logic model. The proposed method extracts relevant topic words from the source documents and uses them as elements of fuzzy sets. Each sentence in the source documents is then used to generate a fuzzy relevance rule that measures the importance of that sentence. A fuzzy inference system generates the final summary. We evaluate our summaries against several well-known summarization systems; the method performs well on divergence and similarity measures.
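The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the topic weights are made-up values standing in for the output of a topic model, the sentence splitter is naive, and the single max-based rule is only an assumed stand-in for the paper's fuzzy relevance rules and inference system. All function names (`split_sentences`, `relevance`, `summarize`) are hypothetical.

```python
import re

# Hypothetical topic words with membership degrees in [0, 1]. In practice these
# might be derived from a topic model such as LDA; the values here are made up.
TOPIC_WEIGHTS = {"summarization": 0.9, "fuzzy": 0.8, "topic": 0.7, "sentence": 0.5}


def split_sentences(text):
    """Naive splitter on ., ! and ? (an assumption, not the paper's method)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def relevance(sentence, weights):
    """Fuzzy OR (max) over the membership degrees of the sentence's words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return max((weights.get(t, 0.0) for t in tokens), default=0.0)


def summarize(text, weights, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: relevance(sentences[i], weights),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    doc = (
        "Fuzzy logic scores each sentence. "
        "The weather was nice. "
        "Topic words guide summarization."
    )
    for line in summarize(doc, TOPIC_WEIGHTS, k=2):
        print(line)
```

Taking the maximum membership degree is the standard fuzzy union; a fuller system would presumably combine several linguistic variables through a rule base and a defuzzification step, which this sketch omits.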
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, S., Belkasim, S., Zhang, Y. (2013). Multi-document Text Summarization Using Topic Model and Fuzzy Logic. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2013. Lecture Notes in Computer Science, vol 7988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39712-7_12
DOI: https://doi.org/10.1007/978-3-642-39712-7_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39711-0
Online ISBN: 978-3-642-39712-7
eBook Packages: Computer Science, Computer Science (R0)