Abstract
The nearest-neighbour (NN) classifier has long been used in pattern recognition, exploratory data analysis, and data mining problems. A vital consideration in obtaining good results with this technique is the choice of distance function, and correspondingly which features to consider when computing distances between samples. In this chapter, a new ensemble technique is proposed to improve the performance of NN classifiers. The proposed approach combines multiple NN classifiers, where each classifier uses a different distance function and potentially a different set of features (feature vector). These feature vectors are determined for each distance metric using a Simple Voting Scheme incorporated in Tabu Search (TS). The proposed ensemble classifier with different distance metrics and different feature vectors (TS–DF/NN) is evaluated using various benchmark data sets from the UCI Machine Learning Repository. Results indicate a significant increase in performance when compared with various well-known classifiers. The proposed ensemble method is also compared with an ensemble classifier using different distance metrics but with the same feature vector (with or without Feature Selection (FS)).
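The core idea described above — several NN classifiers, each pairing its own distance metric with its own feature subset, combined by majority vote — can be sketched as follows. This is an illustrative sketch only: in the chapter, the feature subset for each metric is selected by Tabu Search, whereas here the subsets, toy data, and all function names are hypothetical and fixed by hand to show just the ensemble structure.

```python
# Sketch of a heterogeneous NN ensemble: each member has its own distance
# function and its own feature subset; predictions are combined by majority
# vote. Feature subsets are fixed by hand here (the chapter selects them
# with Tabu Search); all names and data are illustrative.
from collections import Counter
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def nn_predict(sample, train_X, train_y, dist, features):
    # 1-NN: label of the closest training sample under this member's
    # metric, computed only over the member's own feature subset.
    proj = lambda v: [v[i] for i in features]
    best = min(range(len(train_X)),
               key=lambda i: dist(proj(train_X[i]), proj(sample)))
    return train_y[best]

def ensemble_predict(sample, train_X, train_y, members):
    # Simple majority vote over the heterogeneous NN members.
    votes = [nn_predict(sample, train_X, train_y, d, f) for d, f in members]
    return Counter(votes).most_common(1)[0][0]

# Toy data: 3 features, 2 classes.
X = [[0.0, 0.1, 5.0], [0.2, 0.0, 4.5], [3.0, 3.1, 0.0], [2.8, 3.0, 0.2]]
y = ["a", "a", "b", "b"]
# Each member pairs a distance function with its own feature subset.
members = [(euclidean, [0, 1]), (manhattan, [0, 2]), (chebyshev, [1, 2])]
print(ensemble_predict([0.1, 0.05, 4.8], X, y, members))  # prints "a"
```

Because the members disagree both on the metric and on which features it sees, their errors tend to be less correlated than in a single-metric ensemble, which is the motivation for combining them.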
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
Cite this chapter
Tahir, M., Smith, J. (2007). Feature Selection for Heterogeneous Ensembles of Nearest-neighbour Classifiers Using Hybrid Tabu Search. In: Siarry, P., Michalewicz, Z. (eds) Advances in Metaheuristics for Hard Optimization. Natural Computing Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72960-0_4
DOI: https://doi.org/10.1007/978-3-540-72960-0_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72959-4
Online ISBN: 978-3-540-72960-0
eBook Packages: Computer Science (R0)