Abstract
The k-Nearest-Neighbors (kNN) method for classification is simple but effective in many cases. Its success, however, depends on selecting a good value for k. In this paper, we propose a contextual probability-based classification algorithm (CPC) that looks at multiple sets of nearest neighbors, rather than a single set of k nearest neighbors, to reduce the bias introduced by the choice of k. The formalism is based on probability: the idea is to aggregate the support of multiple neighborhoods for the various classes to better reveal the true class of each new instance. To choose a series of more relevant neighborhoods for aggregation, three neighborhood selection methods (distance-based, symmetric-based, and entropy-based) are proposed and evaluated. The experimental results show that CPC obtains better classification accuracy than kNN and is indeed less biased by k after saturation is reached. Moreover, the entropy-based CPC performs best among the three proposed neighborhood selection methods.
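The idea of aggregating support over several neighborhoods can be illustrated with a minimal sketch. The abstract does not give the contextual probability formula, so the code below makes two loudly stated assumptions: neighborhoods are the nested sets of the k nearest points for each k in a list `ks`, and the support for a class is the plain average of its proportion in each neighborhood. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def aggregate_neighborhood_support(train, query, ks):
    """Average each class's proportion over several nested neighborhoods.

    ASSUMPTION: simple averaging of class proportions stands in for the
    paper's contextual probability estimate, which the abstract does not state.
    """
    # Rank training points by Euclidean distance to the query.
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    support = Counter()
    for k in ks:
        # Class counts inside the neighborhood formed by the k nearest points.
        counts = Counter(label for _, label in ranked[:k])
        for label, count in counts.items():
            support[label] += count / k / len(ks)
    return dict(support)


def classify(train, query, ks):
    """Return the class with the highest aggregated support."""
    support = aggregate_neighborhood_support(train, query, ks)
    return max(support, key=support.get)


# Hypothetical toy data: one mislabeled-looking 'b' point sits closest to the query.
train = [
    ((0.4, 0.4), "b"),
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"), ((1.0, 1.0), "a"),
    ((0.5, 1.5), "a"), ((1.5, 0.5), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"),
]
query = (0.5, 0.5)

single = classify(train, query, [1])            # one neighborhood, k = 1
combined = classify(train, query, [1, 3, 5, 7])  # four nested neighborhoods
```

On this toy data, a single neighborhood with k = 1 follows the lone nearby 'b' point, while averaging over the four neighborhoods favors 'a'. This only illustrates why combining neighborhoods can soften the dependence on one choice of k; it says nothing about the accuracy reported in the paper.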
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guo, G., Wang, H., Bell, D., Liao, Z. (2004). Contextual Probability-Based Classification. In: Atzeni, P., Chu, W., Lu, H., Zhou, S., Ling, TW. (eds) Conceptual Modeling – ER 2004. ER 2004. Lecture Notes in Computer Science, vol 3288. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30464-7_25
DOI: https://doi.org/10.1007/978-3-540-30464-7_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23723-5
Online ISBN: 978-3-540-30464-7
eBook Packages: Springer Book Archive