Abstract
The nearest neighbor classifier is a powerful, straightforward, and very popular approach to solving many classification problems. It also enables users to easily incorporate weights of training instances into its model, allowing users to highlight more promising examples. Instance weighting schemes proposed to date were based either on attribute values or external knowledge. In this paper, we propose a new way of weighting instances based on network analysis and centrality measures. Our method relies on transforming the training dataset into a weighted signed network and evaluating the importance of each node using a selected centrality measure. This information is then transferred back to the training dataset in the form of instance weights, which are later used during nearest neighbor classification. We consider four centrality measures appropriate for our problem and empirically evaluate our proposal on 30 popular, publicly available datasets. The results show that the proposed instance weighting enhances the predictive performance of the nearest neighbor algorithm.
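The pipeline described above (dataset to signed network, network to centrality scores, scores to instance weights, weights to nearest neighbor voting) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: linking each instance to its k nearest training neighbors, giving an edge a positive sign when the labels agree and a negative sign otherwise, using 1/(1 + distance) as edge strength, using signed weighted degree as the centrality measure, and squashing scores into (0, 1) with a logistic function. The function names are hypothetical. The authors' own implementation is in R and may differ in every one of these choices.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nearest(X, point, k, skip=None):
    """Indices of the k training instances closest to `point` (optionally skipping one)."""
    candidates = [j for j in range(len(X)) if j != skip]
    return sorted(candidates, key=lambda j: euclidean(X[j], point))[:k]


def build_signed_network(X, y, k=3):
    """Assumed construction: undirected k-NN graph with signed, weighted edges.

    An edge is positive when both endpoints share a label and negative otherwise;
    its magnitude 1 / (1 + distance) is an illustrative choice.
    """
    edges = {}
    for i in range(len(X)):
        for j in nearest(X, X[i], k, skip=i):
            sign = 1.0 if y[i] == y[j] else -1.0
            strength = 1.0 / (1.0 + euclidean(X[i], X[j]))
            edges[(min(i, j), max(i, j))] = sign * strength
    return edges


def instance_weights(X, y, k=3):
    """Signed weighted degree per node, mapped into (0, 1) by a logistic function.

    Signed degree stands in for whichever centrality measure is selected.
    """
    centrality = [0.0] * len(X)
    for (i, j), w in build_signed_network(X, y, k).items():
        centrality[i] += w
        centrality[j] += w
    return [1.0 / (1.0 + math.exp(-c)) for c in centrality]


def predict(X, y, weights, query, k=3):
    """Nearest neighbor vote in which each neighbor contributes its instance weight."""
    votes = defaultdict(float)
    for j in nearest(X, query, k):
        votes[y[j]] += weights[j]
    return max(votes, key=votes.get)
```

In this sketch, a training instance surrounded mostly by instances of another class collects negative edges, receives a low centrality score, and therefore contributes little to the vote, while instances deep inside their own class keep weights closer to 1.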
Notes
1. Source code in R and reproducible test scripts are available at: http://www.cs.put.poznan.pl/dbrzezinski/software.php.
Acknowledgments
This research is partly funded by the Polish National Science Center under Grant No. 2015/19/B/ST6/02637. D. Brzezinski acknowledges the support of an FNP START scholarship.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Piernik, M., Brzezinski, D., Morzy, T., Morzy, M. (2017). Using Network Analysis to Improve Nearest Neighbor Classification of Non-network Data. In: Kryszkiewicz, M., Appice, A., Ślęzak, D., Rybinski, H., Skowron, A., Raś, Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham. https://doi.org/10.1007/978-3-319-60438-1_11
DOI: https://doi.org/10.1007/978-3-319-60438-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60437-4
Online ISBN: 978-3-319-60438-1
eBook Packages: Computer Science (R0)