Abstract
At present, information retrieval systems are typically driven by combinations of keyword and phrase searches, and they find the information users need by direct keyword matching. Web document retrieval systems, however, return too many documents because of term ambiguity. It also often happens that a word with several meanings occurs in a document in a context quite different from the one the querying person expected. As a result, the user needs extra time and effort to find more closely matching documents. To overcome these problems, in this paper we propose a content-based information retrieval system that connects documents according to their degree of semantic link, which is expressed as a fuzzy value by a fuzzy function. We also propose an algorithm that produces a hierarchical structure using the degree of concepts and contents among documents. As a result, we are able to select and provide the documents the user is interested in.
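The abstract does not give the fuzzy function or the hierarchy-building algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes, purely for illustration, that a term's membership in a document is its scaled term frequency, that the link degree between two documents is a fuzzy Jaccard ratio, and that a hierarchy can be read off by grouping documents at successively lower thresholds (alpha-cuts). All function names and the sample documents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def memberships(text):
    """Assumed fuzzy membership of each term: term frequency scaled to [0, 1]."""
    counts = Counter(text.lower().split())
    peak = max(counts.values())
    return {term: n / peak for term, n in counts.items()}


def link_degree(a, b):
    """Assumed link degree: fuzzy Jaccard (sum of minima over sum of maxima)."""
    terms = set(a) | set(b)
    inter = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    union = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return inter / union if union else 0.0


def alpha_cut_groups(docs, alpha):
    """Group documents connected by links of degree >= alpha (union-find)."""
    fuzzy = {name: memberships(text) for name, text in docs.items()}
    parent = {name: name for name in docs}

    def find(x):
        # Follow parent pointers to the group representative, halving paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y in combinations(docs, 2):
        if link_degree(fuzzy[x], fuzzy[y]) >= alpha:
            parent[find(x)] = find(y)

    groups = {}
    for name in docs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


def hierarchy(docs, alphas=(0.5, 0.2)):
    """Map each threshold to its grouping; lower thresholds give coarser levels."""
    return {alpha: alpha_cut_groups(docs, alpha) for alpha in alphas}


# Hypothetical sample documents, not taken from the paper.
docs = {
    "d1": "fuzzy set theory fuzzy logic",
    "d2": "fuzzy logic and fuzzy set",
    "d3": "bank river water",
    "d4": "fuzzy water",
}
levels = hierarchy(docs)
```

With these sample documents, the stricter threshold keeps only the two closely related documents together, while the looser threshold merges all four into one group, which gives a two-level hierarchy.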
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Han, SW., Eun, HJ., Kim, YS., Kóczy, L.T. (2004). A Document Classification Algorithm Using the Fuzzy Set Theory and Hierarchical Structure of Document. In: Laganá, A., Gavrilova, M.L., Kumar, V., Mun, Y., Tan, C.J.K., Gervasi, O. (eds) Computational Science and Its Applications – ICCSA 2004. ICCSA 2004. Lecture Notes in Computer Science, vol 3043. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24707-4_16
DOI: https://doi.org/10.1007/978-3-540-24707-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22054-1
Online ISBN: 978-3-540-24707-4