Abstract
Applications that operate on multiple documents have emerged in recent years. Information across topically related documents can often be linked. Cross-document Structure Theory (CST) analyzes the relationships that exist between sentences across related documents. However, most existing work relies on human experts to identify CST relationships. In this work, we aim to automatically identify some of the CST relations using a supervised learning method. We propose a Genetic-CBR approach, which incorporates a genetic algorithm (GA) to improve case-based reasoning (CBR) classification. The GA is used to scale the weights of the data features used by the CBR classifier. We perform experiments using datasets obtained from the CSTBank corpus. Comparison with other learning methods shows that the proposed method yields better results.
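The idea of a GA scaling the feature weights of a CBR classifier can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: CBR retrieval is reduced to a weighted nearest-neighbour lookup, the GA uses truncation selection with one-point crossover and Gaussian mutation, and the two-feature cases and their labels are made-up toy data, not CSTBank data.

```python
import random

def weighted_distance(a, b, weights):
    """Euclidean distance in which each feature is scaled by its weight."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5

def cbr_classify(case_base, query, weights):
    """Retrieve the most similar stored case and reuse its label."""
    nearest = min(case_base, key=lambda c: weighted_distance(c[0], query, weights))
    return nearest[1]

def fitness(weights, case_base, validation):
    """Fraction of validation cases classified correctly with these weights."""
    hits = sum(cbr_classify(case_base, x, weights) == y for x, y in validation)
    return hits / len(validation)

def evolve_weights(case_base, validation, n_features, pop_size=20,
                   generations=30, mutation_rate=0.2, seed=0):
    """Search for feature weights in [0, 1] with a simple elitist GA.

    Assumes n_features >= 2 so that one-point crossover has a cut point.
    """
    rng = random.Random(seed)
    # Start from uniform weights plus random candidates.
    population = [[1.0] * n_features]
    population += [[rng.random() for _ in range(n_features)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda w: fitness(w, case_base, validation),
                        reverse=True)
        elite = ranked[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Gaussian mutation, clamped back into [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = elite + children
    return max(population, key=lambda w: fitness(w, case_base, validation))

# Hypothetical toy cases: (feature vector, label). Feature 0 is informative,
# feature 1 is noise, so a good weighting should downplay feature 1.
case_base = [((0.1, 0.9), "relation"), ((0.2, 0.1), "relation"),
             ((0.8, 0.8), "no_relation"), ((0.9, 0.2), "no_relation")]
validation = [((0.15, 0.5), "relation"), ((0.85, 0.5), "no_relation"),
              ((0.3, 0.85), "relation"), ((0.7, 0.15), "no_relation")]

best = evolve_weights(case_base, validation, n_features=2)
print(best, fitness(best, case_base, validation))
```

Because the fitter half of each generation is carried over unchanged and the initial population includes the uniform weighting, the returned weights never score worse on the validation cases than unweighted retrieval does. A real system would use held-out data for the final evaluation rather than the cases used as the fitness signal.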
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kumar, Y.J., Salim, N., Abuobieda, A. (2012). A Genetic-CBR Approach for Cross-Document Relationship Identification. In: Hassanien, A.E., Salem, AB.M., Ramadan, R., Kim, Th. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2012. Communications in Computer and Information Science, vol 322. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35326-0_19
DOI: https://doi.org/10.1007/978-3-642-35326-0_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35325-3
Online ISBN: 978-3-642-35326-0
eBook Packages: Computer Science, Computer Science (R0)