Abstract
Instance-based machine learning algorithms often suffer from prohibitive computational cost and storage requirements. To overcome these problems, various instance reduction techniques have been developed to remove noise and/or redundant instances. Condensation is the most frequently used approach; it aims to remove instances that lie far from the decision surface. Editing is another popular approach; it removes noise to improve classification accuracy. Drawbacks of these existing techniques include parameter dependency and relatively low accuracy and reduction rates. To address these drawbacks, this paper proposes the constraint nearest neighbor-based instance reduction (CNNIR) algorithm. We first introduce the concept of the natural neighbor and apply it to instance reduction to eliminate noise and search for core instances. Then, we define a constraint nearest neighbor chain, which consists of only three instances, and use it to select border instances that can construct a rough decision boundary. After that, a specific strategy is given to reduce the border set. Finally, the reduced set is obtained by merging the border and core instances. Experimental results show that, compared with existing algorithms, the proposed algorithm effectively reduces the number of instances and achieves higher classification accuracy. Moreover, it does not require any user-defined parameters.
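The two existing families named in the abstract, editing and condensation, can be illustrated with their classic representatives: Wilson's edited nearest neighbor rule and Hart's condensed nearest neighbor rule. The sketch below is only an illustration of those two baselines on made-up toy data; it is not the proposed CNNIR algorithm, and the function names, the toy points, and the choice of k = 3 are assumptions introduced here.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_labels(x, points, labels, k, skip=None):
    """Return the labels of the k points nearest to x (optionally skipping one index)."""
    order = sorted(
        (i for i in range(len(points)) if i != skip),
        key=lambda i: dist(x, points[i]),
    )
    return [labels[i] for i in order[:k]]


def edit_enn(points, labels, k=3):
    """Editing: keep an instance only if the majority of its k neighbors share its label."""
    keep = []
    for i, x in enumerate(points):
        votes = Counter(knn_labels(x, points, labels, k, skip=i))
        if votes.most_common(1)[0][0] == labels[i]:
            keep.append(i)
    return keep


def condense_cnn(points, labels):
    """Condensation: add an instance to the store only if the store misclassifies it (1-NN)."""
    store = [0]  # start from the first instance
    changed = True
    while changed:  # repeat passes until no instance is added
        changed = False
        for i, x in enumerate(points):
            if i in store:
                continue
            store_points = [points[j] for j in store]
            store_labels = [labels[j] for j in store]
            if knn_labels(x, store_points, store_labels, 1)[0] != labels[i]:
                store.append(i)
                changed = True
    return sorted(store)


# Hypothetical toy data: two well-separated classes plus one mislabeled point (index 4).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

kept = edit_enn(points, labels, k=3)            # editing removes the noisy point
edited_points = [points[i] for i in kept]
edited_labels = [labels[i] for i in kept]
condensed = condense_cnn(edited_points, edited_labels)  # indices into the edited set
```

Note that the editing step needs a user-chosen k and the condensation step depends on the order in which instances are visited, which is the kind of parameter and configuration dependency the abstract lists as a drawback of existing techniques.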




Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61802360 and 61502060), the Project of Chongqing Education Commission (No. KJZH17104), the Fundamental Research Funds for the Central Universities (No. 2018NQN05) and the China Postdoctoral Science Foundation (No. 2016M602651).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Human and animal rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Yang, L., Zhu, Q., Huang, J. et al. Constraint nearest neighbor for instance reduction. Soft Comput 23, 13235–13245 (2019). https://doi.org/10.1007/s00500-019-03865-z
DOI: https://doi.org/10.1007/s00500-019-03865-z