Formation of categories in document classification systems

  • Track 2: Artificial Intelligence
  • Conference paper
Computing in the 90's (Great Lakes CS 1989)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 507)

Abstract

Information retrieval systems classify documents into categories to facilitate retrieval. Successful categorization depends on solving three subproblems: creating the categories, determining the relationships between them, and maintaining the categorization system. In existing document categorization systems, the categories are formed by trial and error, which lengthens the initial setup period. Setup time grows further because the relationships between categories are assigned empirically.

In this paper, we propose a solution to the problem of developing categories by applying techniques that originate in knowledge acquisition. The approach captures the knowledge of a user to ensure continuity with the existing categorization system. Using Personal Construct Theory for knowledge elicitation helps make explicit the subconscious hierarchical relationships between categories as the user perceives them.
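The abstract does not describe the elicitation procedure, so the following is only a minimal sketch of how a hierarchy between categories might be surfaced from a user's judgments. It assumes a repertory grid, the rating instrument commonly associated with Personal Construct Theory, and it swaps in plain single-linkage clustering as a stand-in for whatever analysis the paper actually uses. The category names, constructs, ratings, and function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical repertory grid. Columns are document categories (the
# "elements"); rows are bipolar constructs elicited from a user, each
# rated on a 1-5 scale. None of these values come from the paper.
CATEGORIES = ["databases", "retrieval", "expert systems", "graphics"]
GRID = {
    "theoretical (1) vs applied (5)":   [3, 3, 4, 5],
    "text-oriented (1) vs numeric (5)": [2, 1, 2, 5],
    "storage (1) vs reasoning (5)":     [1, 2, 5, 3],
}


def distance(i, j):
    """City-block distance between categories i and j over all constructs."""
    return sum(abs(row[i] - row[j]) for row in GRID.values())


def linkage(a, b):
    """Single-linkage distance: the closest pair across two clusters."""
    return min(distance(i, j) for i in a for j in b)


def cluster(n):
    """Merge the two closest clusters repeatedly; return merges in order."""
    clusters = [frozenset([i]) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((sorted(a), sorted(b)))
    return merges


if __name__ == "__main__":
    for left, right in cluster(len(CATEGORIES)):
        names = lambda idx: [CATEGORIES[k] for k in idx]
        print(names(left), "+", names(right))
```

Read from first merge to last, the merge order gives a nested grouping: categories that the user rates alike on every construct join early and sit close together in the implied hierarchy, while categories rated differently join late.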

This research is supported by NSF grant IRI-8805875.

Editor information

Naveed A. Sherwani, Elise de Doncker, John A. Kapenga

Copyright information

© 1991 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bhatia, S.K., Deogun, J.S., Raghavan, V.V. (1991). Formation of categories in document classification systems. In: Sherwani, N.A., de Doncker, E., Kapenga, J.A. (eds) Computing in the 90's. Great Lakes CS 1989. Lecture Notes in Computer Science, vol 507. Springer, New York, NY. https://doi.org/10.1007/BFb0038478

  • DOI: https://doi.org/10.1007/BFb0038478

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-97628-0

  • Online ISBN: 978-0-387-34815-5

  • eBook Packages: Springer Book Archive
