Abstract
One approach to dealing with large query answers or large collections of text documents is to impose some kind of structure on the collection, for instance by grouping it into clusters of related or close items. Another approach is to consider characteristics of the collection, for instance central and/or frequent keywords, possibly taken from a background vocabulary or from a more thoroughly structured body of background knowledge such as a taxonomy or an ontology. In this paper we present a preliminary approach that combines these two directions. More specifically, we address an approach in which conceptual summaries can be provided as answers to queries or as surveys over a document collection. The general idea is to apply a background knowledge ontology in connection with a combined clustering and generalization of keywords.
Preliminary experiments with WordNet as background knowledge and excerpts from SemCor as data are presented and discussed.
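To make the idea of combining clustering with generalization concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a tiny hand-made is-a taxonomy (the `PARENT` table, the concept names, and the functions `least_common_subsumer` and `summarize` are all hypothetical illustrations, not taken from WordNet or from the paper). It repeatedly merges the two keywords whose least common subsumer is most specific and replaces them with that subsumer, until the summary is small enough.

```python
from itertools import combinations

# Hypothetical toy is-a taxonomy (child -> parent); NOT taken from WordNet.
PARENT = {
    "dog": "canine", "wolf": "canine", "canine": "mammal",
    "cat": "feline", "feline": "mammal", "mammal": "animal",
    "sparrow": "bird", "bird": "animal", "animal": None,
}


def path_to_root(concept):
    """Return the list [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Number of is-a edges between the concept and the root."""
    return len(path_to_root(concept)) - 1


def least_common_subsumer(a, b):
    """Most specific concept that generalizes both a and b."""
    ancestors_of_b = set(path_to_root(b))
    for candidate in path_to_root(a):
        if candidate in ancestors_of_b:
            return candidate
    raise ValueError("concepts share no ancestor")


def summarize(keywords, max_concepts):
    """Greedy sketch: merge the closest pair into its subsumer until
    at most max_concepts concepts remain."""
    concepts = set(keywords)
    while len(concepts) > max_concepts:
        # Pick the pair whose common generalization is deepest (most specific).
        a, b = max(
            combinations(sorted(concepts), 2),
            key=lambda pair: depth(least_common_subsumer(*pair)),
        )
        concepts -= {a, b}
        concepts.add(least_common_subsumer(a, b))
    return concepts
```

With this toy taxonomy, `summarize(["dog", "wolf", "cat", "sparrow"], 2)` first generalizes "dog" and "wolf" to "canine", then "canine" and "cat" to "mammal", and leaves "sparrow" untouched. Using the depth of the subsumer as the merge criterion is one simple choice made for this sketch; other similarity measures over the ontology could be substituted.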
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Andreasen, T., Bulskov, H., Terney, T.V. (2008). Ontological Summaries through Hierarchical Clustering. In: An, A., Matwin, S., Raś, Z.W., Ślęzak, D. (eds) Foundations of Intelligent Systems. ISMIS 2008. Lecture Notes in Computer Science, vol 4994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68123-6_54
DOI: https://doi.org/10.1007/978-3-540-68123-6_54
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68122-9
Online ISBN: 978-3-540-68123-6
eBook Packages: Computer Science, Computer Science (R0)