Abstract
This paper proposes a method that improves the performance of kNN-based text classification by using well-estimated parameters. Several variants of the kNN method, with different decision functions, k values, and feature sets, are proposed and evaluated to identify adequate parameters. Our experimental results show that a kNN method with carefully chosen parameters significantly improves performance and reduces the size of the feature set. We cautiously conclude that tuning the parameters of the kNN method is a worthwhile way to increase performance, compared with the effort of developing a new learning method.
This work was supported by a Hanshin University Research Grant in 2004.
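The abstract names three tunable parameters (decision function, k, and feature set) without specifying them. The following is a minimal sketch of how such parameters might interact, not the paper's implementation. The cosine similarity measure, the two decision functions ("vote" and "weighted"), the function names, and the toy training set are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set; not data from the paper.
TRAIN = [
    ("bank raises interest rates", "finance"),
    ("stock market shares fall", "finance"),
    ("ticket prices rise for cup final match", "sport"),
    ("football fans rise early for match tickets today", "sport"),
]

def vectorize(text, vocabulary=None):
    """Bag-of-words term counts, optionally restricted to a feature set."""
    counts = Counter(text.lower().split())
    if vocabulary is not None:
        counts = Counter({t: c for t, c in counts.items() if t in vocabulary})
    return counts

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(c * b[t] for t, c in a.items() if t in b)
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0

def knn_classify(query, training, k=3, decision="vote", vocabulary=None):
    """Label a query using its k most similar training documents.

    decision="vote":     each neighbour adds 1 to its label's score.
    decision="weighted": each neighbour adds its similarity instead.
    Both decision functions are assumed here for illustration.
    """
    q = vectorize(query, vocabulary)
    scored = [(cosine(q, vectorize(text, vocabulary)), label)
              for text, label in training]
    neighbours = sorted(scored, key=lambda s: s[0], reverse=True)[:k]
    scores = defaultdict(float)
    for sim, label in neighbours:
        scores[label] += 1.0 if decision == "vote" else sim
    return max(scores, key=scores.get)

if __name__ == "__main__":
    query = "interest rates rise"
    print(knn_classify(query, TRAIN, k=1))                       # finance
    print(knn_classify(query, TRAIN, k=3, decision="vote"))      # sport
    print(knn_classify(query, TRAIN, k=3, decision="weighted"))  # finance
```

On this toy data the same query receives different labels depending on k and the decision function: two weakly similar neighbours outvote one strongly similar neighbour under plain voting, but not under similarity weighting. Passing a `vocabulary` set restricts the features used, which is one simple way to experiment with a smaller feature set.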
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lim, H.S. (2004). Improving kNN Based Text Classification with Well Estimated Parameters. In: Pal, N.R., Kasabov, N., Mudi, R.K., Pal, S., Parui, S.K. (eds) Neural Information Processing. ICONIP 2004. Lecture Notes in Computer Science, vol 3316. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30499-9_79
DOI: https://doi.org/10.1007/978-3-540-30499-9_79
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23931-4
Online ISBN: 978-3-540-30499-9