Abstract
We propose a method that supports WWW retrieval by constructing a flexible category structure adaptable to the user's search intention. The method uses categorization viewpoints as a priori knowledge, where a categorization viewpoint is a finite set of consistent category names. A set of documents retrieved by initial keywords is decomposed according to each categorization viewpoint, and each decomposition is scored by clearness or entropy. The user selects an appropriate decomposition by considering the scores. Decomposition is performed recursively until a category structure of reasonable size is obtained. Experimental results show that the sets of documents decomposed by the proposed method have higher precision than those decomposed by clustering (K-means). They also show that both the clearness-based and the entropy-based scores of a decomposition correlate relatively highly with precision.
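The decomposition-and-scoring step described above can be illustrated with a minimal sketch. The abstract does not give the paper's definitions of clearness or entropy, so everything here is an assumption made for illustration: documents are assigned to a category when the category name appears as a word in the text, and the score is the Shannon entropy of the resulting category sizes. The function names `decompose` and `entropy_score` are hypothetical, and the abstract does not state whether a higher or lower score is preferable.

```python
import math
from collections import defaultdict


def decompose(docs, viewpoint):
    """Split documents by one categorization viewpoint.

    docs: dict mapping document id -> text.
    viewpoint: list of category names (assumed to be single words).
    Assumption: a document belongs to a category if the category
    name occurs as a whitespace-separated word in its text.
    """
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        for name in viewpoint:
            if name.lower() in words:
                groups[name].append(doc_id)
    return dict(groups)


def entropy_score(groups):
    """Shannon entropy (in bits) of the category size distribution.

    This is a stand-in for the paper's entropy score, whose exact
    definition is not given in the abstract.
    """
    sizes = [len(members) for members in groups.values() if members]
    total = sum(sizes)
    if total == 0:
        return 0.0
    return -sum((s / total) * math.log2(s / total) for s in sizes)


if __name__ == "__main__":
    # Toy documents standing in for an initial keyword search result.
    docs = {
        1: "hotel in tokyo",
        2: "osaka food guide",
        3: "tokyo tower tickets",
        4: "osaka castle",
    }
    groups = decompose(docs, ["tokyo", "osaka"])
    print(groups, entropy_score(groups))
```

Under these assumptions, a user-facing tool would compute one such score per candidate viewpoint, present the scored decompositions, and repeat the process on whichever category the user chooses to refine.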
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Takata, Y., Nakagawa, K., Seki, H. (2000). Flexible Category Structure for Supporting WWW Retrieval. In: Liddle, S.W., Mayr, H.C., Thalheim, B. (eds) Conceptual Modeling for E-Business and the Web. ER 2000. Lecture Notes in Computer Science, vol 1921. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45394-6_15
DOI: https://doi.org/10.1007/3-540-45394-6_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41073-7
Online ISBN: 978-3-540-45394-9
eBook Packages: Springer Book Archive