Using Network Analysis to Improve Nearest Neighbor Classification of Non-network Data

Piernik, Maciej; Brzezinski, Dariusz; Morzy, Tadeusz; Morzy, Mikolaj

doi:10.1007/978-3-319-60438-1_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10352)

Included in the following conference series:

International Symposium on Methodologies for Intelligent Systems

Abstract

The nearest neighbor classifier is a powerful, straightforward, and very popular approach to solving many classification problems. It also lets users easily incorporate weights of training instances into its model and thereby emphasize more promising examples. Instance weighting schemes proposed to date have been based either on attribute values or on external knowledge. In this paper, we propose a new way of weighting instances based on network analysis and centrality measures. Our method relies on transforming the training dataset into a weighted signed network and evaluating the importance of each node using a selected centrality measure. This information is then transferred back to the training dataset in the form of instance weights, which are later used during nearest neighbor classification. We consider four centrality measures appropriate for our problem and empirically evaluate our proposal on 30 popular, publicly available datasets. The results show that the proposed instance weighting enhances the predictive performance of the nearest neighbor algorithm.
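
The pipeline described in the abstract (dataset to signed network, node centrality, instance weights, weighted nearest neighbor voting) can be illustrated with a minimal sketch. The abstract does not say how edges are constructed or which four centrality measures are used, so every concrete choice below is an assumption made for illustration only: edge strength is the similarity 1 / (1 + Euclidean distance), an edge is positive between same-class instances and negative otherwise, centrality is the signed weighted degree, and centralities are min-max scaled into weights using a hypothetical `floor` parameter. This is not the authors' R implementation.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def signed_network(X, y):
    """Build a complete weighted signed network over the training instances.

    Assumption: edge strength is 1 / (1 + distance); the sign is positive
    when both instances share a class label and negative otherwise.
    """
    edges = {}
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            sim = 1.0 / (1.0 + euclidean(X[i], X[j]))
            edges[(i, j)] = sim if y[i] == y[j] else -sim
    return edges


def signed_degree(edges, n):
    """Signed weighted degree: the sum of signed edge weights at each node."""
    centrality = [0.0] * n
    for (i, j), w in edges.items():
        centrality[i] += w
        centrality[j] += w
    return centrality


def to_weights(centrality, floor=0.1):
    """Min-max scale centralities into [floor, 1] (floor is hypothetical)."""
    lo, hi = min(centrality), max(centrality)
    if hi == lo:
        return [1.0] * len(centrality)
    return [floor + (1.0 - floor) * (c - lo) / (hi - lo) for c in centrality]


def weighted_knn_predict(X, y, weights, query, k=3):
    """Each of the k nearest training instances votes with its weight."""
    nearest = sorted(range(len(X)), key=lambda i: euclidean(X[i], query))[:k]
    votes = defaultdict(float)
    for i in nearest:
        votes[y[i]] += weights[i]
    return max(votes, key=votes.get)


# Toy data: two clusters plus one "b" instance lying inside the "a" cluster.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (0.5, 0.5)]
y = ["a", "a", "a", "b", "b", "b", "b"]

edges = signed_network(X, y)
weights = to_weights(signed_degree(edges, len(X)))
prediction = weighted_knn_predict(X, y, weights, (0.45, 0.45), k=2)
print(weights)
print(prediction)
```

In this toy example the out-of-place instance at (0.5, 0.5) has strong negative ties to its close neighbors and only weak positive ties to its distant classmates, so it receives the lowest weight and is outvoted by a better-connected neighbor. Other centrality measures could be substituted for `signed_degree` without changing the rest of the sketch.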

Notes

1. Source code in R and reproducible test scripts are available at: http://www.cs.put.poznan.pl/dbrzezinski/software.php.

Acknowledgments

This research is partly funded by the Polish National Science Center under Grant No. 2015/19/B/ST6/02637. D. Brzezinski acknowledges the support of an FNP START scholarship.

Author information

Authors and Affiliations

Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 2, 60-965, Poznan, Poland
Maciej Piernik, Dariusz Brzezinski, Tadeusz Morzy & Mikolaj Morzy

Corresponding author

Correspondence to Maciej Piernik.

Editor information

Editors and Affiliations

Warsaw University of Technology, Warsaw, Poland
Marzena Kryszkiewicz
University of Bari Aldo Moro, Bari, Italy
Annalisa Appice
Institute of Informatics, University of Warsaw, Warsaw, Poland
Dominik Ślęzak
Faculty of Electronics & Information, Warsaw University of Technology, Warsaw, Poland
Henryk Rybinski
Institute of Mathematics, Warsaw University, Warsaw, Poland
Andrzej Skowron
Department of Computer Science, University of North Carolina at Charlotte, North Carolina, USA
Zbigniew W. Raś

About this paper

Cite this paper

Piernik, M., Brzezinski, D., Morzy, T., Morzy, M. (2017). Using Network Analysis to Improve Nearest Neighbor Classification of Non-network Data. In: Kryszkiewicz, M., Appice, A., Ślęzak, D., Rybinski, H., Skowron, A., Raś, Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham. https://doi.org/10.1007/978-3-319-60438-1_11

DOI: https://doi.org/10.1007/978-3-319-60438-1_11
Published: 14 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60437-4
Online ISBN: 978-3-319-60438-1
eBook Packages: Computer Science, Computer Science (R0)
