Using Network Analysis to Improve Nearest Neighbor Classification of Non-network Data

  • Conference paper
Foundations of Intelligent Systems (ISMIS 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10352)

Abstract

The nearest neighbor classifier is a powerful, straightforward, and very popular approach to solving many classification problems. It also enables users to easily incorporate weights of training instances into its model, allowing users to highlight more promising examples. Instance weighting schemes proposed to date were based either on attribute values or external knowledge. In this paper, we propose a new way of weighting instances based on network analysis and centrality measures. Our method relies on transforming the training dataset into a weighted signed network and evaluating the importance of each node using a selected centrality measure. This information is then transferred back to the training dataset in the form of instance weights, which are later used during nearest neighbor classification. We consider four centrality measures appropriate for our problem and empirically evaluate our proposal on 30 popular, publicly available datasets. The results show that the proposed instance weighting enhances the predictive performance of the nearest neighbor algorithm.
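
The pipeline outlined in the abstract (dataset to signed network, network to centrality scores, scores to instance weights, weights to nearest neighbor voting) can be illustrated with a minimal sketch. The abstract does not specify how edges are constructed or which four centrality measures are used, so everything below is an assumption made for illustration only: edges carry a similarity of 1/(1+d), signed positive between same-class instances and negative otherwise; centrality is a plain signed weighted degree; and the function names `signed_network`, `degree_weights`, and `weighted_knn_predict` are hypothetical. The authors' own implementation is in R and may differ in every one of these choices.

```python
import math
from collections import defaultdict


def signed_network(X, y):
    """Complete signed graph as an adjacency matrix.

    Assumption: edge weight is the similarity 1 / (1 + Euclidean distance),
    positive for same-class pairs and negative for different-class pairs.
    """
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = 1.0 / (1.0 + math.dist(X[i], X[j]))
            W[i][j] = W[j][i] = sim if y[i] == y[j] else -sim
    return W


def degree_weights(W):
    """Signed weighted degree per node, min-max rescaled to [0.1, 1.0].

    Assumption: degree stands in for the paper's centrality measures.
    """
    deg = [sum(row) for row in W]
    lo, hi = min(deg), max(deg)
    if hi == lo:
        return [1.0] * len(deg)
    return [0.1 + 0.9 * (d - lo) / (hi - lo) for d in deg]


def weighted_knn_predict(X, y, weights, query, k=3):
    """The k nearest training instances each vote with their instance weight."""
    nearest = sorted(range(len(X)), key=lambda i: math.dist(X[i], query))[:k]
    votes = defaultdict(float)
    for i in nearest:
        votes[y[i]] += weights[i]
    return max(votes, key=votes.get)


# Toy data: the last instance is labeled "b" but sits among the "a" instances.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (1.0, 1.0)]
y = ["a", "a", "a", "b", "b", "b"]

W = signed_network(X, y)
weights = degree_weights(W)  # the instance at (1, 1) receives the smallest weight
prediction = weighted_knn_predict(X, y, weights, (0.8, 0.8), k=3)
```

In this toy setting the instance at (1, 1) has strong negative ties to its close neighbors and only weak positive ties to its distant classmates, so its centrality, and hence its vote, is the smallest in the training set. Rescaling to a strictly positive range is one simple way to keep every instance able to vote; other mappings are equally plausible.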

Notes

  1. Source code in R and reproducible test scripts available at: http://www.cs.put.poznan.pl/dbrzezinski/software.php.

References

  1. Fix, E., Hodges, J.L.: Discriminatory analysis, nonparametric discrimination: consistency properties. US Air Force Sch. Aviat. Med. Technical Report 4(3), 477+ (1951)

  2. Freeman, L.C.: Centrality in social networks conceptual clarification. Soc. Netw. 1(3), 215–239 (1978)

  3. Dudani, S.A.: The distance-weighted k-nearest-neighbor rule. IEEE Trans. Syst. Man Cybern. SMC-6(3), 325–327 (1976)

  4. Gou, J., Du, L., Zhang, Y., Xiong, T.: A new distance-weighted k-nearest neighbor classifier. J. Inf. Comput. Sci. 9(6), 1429–1436 (2012)

  5. Samworth, R.J.: Optimal weighted nearest neighbour classifiers. Ann. Stat. 40(5), 2733–2763 (2012)

  6. Kaur, M., Singh, S.: Analyzing negative ties in social networks: a survey. Egypt. Inform. J. 17(1), 21–43 (2016)

  7. Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393(6684), 409–410 (1998)

  8. Bonacich, P.: Factoring and weighting approaches to status scores and clique identification. J. Math. Sociol. 2(1), 113–120 (1972)

  9. Katz, L.: A new status index derived from sociometric analysis. Psychometrika 18(1), 39–43 (1953)

  10. Bonacich, P.: Power and centrality: a family of measures. Am. J. Sociol. 92(5), 1170–1182 (1987)

  11. Everett, M., Borgatti, S.: Networks containing negative ties. Soc. Netw. 38, 111–120 (2014)

  12. Costantini, G., Perugini, M.: Generalization of clustering coefficients to signed correlation networks. PLoS ONE 9(2), 1–10 (2014)

  13. Bonacich, P., Lloyd, P.: Calculating status with negative relations. Soc. Netw. 26(4), 331–338 (2004)

  14. Tan, P.N., Steinbach, M., Kumar, V.: Introduction to Data Mining. Addison Wesley, Boston (2005)

  15. Japkowicz, N., Shah, M.: Evaluating Learning Algorithms: A Classification Perspective. Cambridge University Press, Cambridge (2011)

  16. Lichman, M.: UCI machine learning repository (2013)

Acknowledgments

This research is partly funded by the Polish National Science Center under Grant No. 2015/19/B/ST6/02637. D. Brzezinski acknowledges the support of an FNP START scholarship.

Author information

Corresponding author

Correspondence to Maciej Piernik.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Piernik, M., Brzezinski, D., Morzy, T., Morzy, M. (2017). Using Network Analysis to Improve Nearest Neighbor Classification of Non-network Data. In: Kryszkiewicz, M., Appice, A., Ślęzak, D., Rybinski, H., Skowron, A., Raś, Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham. https://doi.org/10.1007/978-3-319-60438-1_11

  • DOI: https://doi.org/10.1007/978-3-319-60438-1_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60437-4

  • Online ISBN: 978-3-319-60438-1

  • eBook Packages: Computer Science, Computer Science (R0)
