Abstract
Information retrieval systems classify documents into categories to facilitate retrieval. Successful categorization depends on solving three subproblems: creating the categories, determining the relationships between them, and maintaining the categorization system. In existing document categorization systems, categories are formed by trial and error, which lengthens the initial setup period. Setup time grows further because the relationships between categories are assigned empirically.
In this paper, we propose a solution to the problem of developing categories that applies techniques originating in knowledge acquisition. The approach captures the knowledge of a user in order to ensure continuity with the existing categorization system. Using Personal Construct Theory for knowledge elicitation helps make explicit the subconscious hierarchical relationships between categories as perceived by the user.
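As an illustration only, the following sketch shows one way hierarchical relationships between categories might be derived from a repertory grid, the elicitation instrument commonly associated with Personal Construct Theory. The abstract does not specify the procedure used in the paper, so everything here is an assumption: the category names, the bipolar constructs, the 1 to 5 ratings, the city-block distance, and the single-linkage clustering are all hypothetical choices made for the example.

```python
from itertools import combinations

# Hypothetical repertory grid: each category (element) is rated 1-5 on
# bipolar constructs elicited from the user. Names and ratings are invented.
CONSTRUCTS = [
    "theoretical (1) - applied (5)",
    "hardware (1) - software (5)",
    "narrow (1) - broad (5)",
]
GRID = {
    "databases":      [4, 5, 3],
    "info_retrieval": [4, 5, 4],
    "algorithms":     [1, 4, 4],
    "vlsi_design":    [5, 1, 2],
}


def distance(a, b):
    """City-block distance between two rating vectors (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cluster_distance(c1, c2, grid):
    """Single linkage: the smallest distance between any two members."""
    return min(distance(grid[a], grid[b]) for a in c1 for b in c2)


def build_hierarchy(grid):
    """Agglomerative clustering over the grid.

    Returns the merges in order, each as (cluster_a, cluster_b, distance),
    where a cluster is a tuple of category names.
    """
    clusters = [(name,) for name in sorted(grid)]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters and merge them.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], grid),
        )
        merges.append((a, b, cluster_distance(a, b, grid)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


if __name__ == "__main__":
    for a, b, d in build_hierarchy(GRID):
        print(f"merge {a} + {b} at distance {d}")
```

Read from first merge to last, the merge list describes a tree: categories that the user rates alike on the elicited constructs join early, and dissimilar ones join near the root. A different distance measure or linkage rule would yield a different hierarchy, so these choices should be treated as tunable rather than prescribed.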
This research is supported by NSF grant IRI-8805875.
© 1991 Springer-Verlag Berlin Heidelberg
Bhatia, S.K., Deogun, J.S., Raghavan, V.V. (1991). Formation of categories in document classification systems. In: Sherwani, N.A., de Doncker, E., Kapenga, J.A. (eds) Computing in the 90's. Great Lakes CS 1989. Lecture Notes in Computer Science, vol 507. Springer, New York, NY. https://doi.org/10.1007/BFb0038478
DOI: https://doi.org/10.1007/BFb0038478
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-97628-0
Online ISBN: 978-0-387-34815-5
eBook Packages: Springer Book Archive