Abstract
The nearest-neighbour (NN) classifier has long been used in pattern recognition, exploratory data analysis, and data mining problems. A vital consideration in obtaining good results with this technique is the choice of distance function, and correspondingly which features to consider when computing distances between samples. In this chapter, a new ensemble technique is proposed to improve the performance of NN classifiers. The proposed approach combines multiple NN classifiers, where each classifier uses a different distance function and potentially a different set of features (feature vector). These feature vectors are determined for each distance metric using a Simple Voting Scheme incorporated in Tabu Search (TS). The proposed ensemble classifier with different distance metrics and different feature vectors (TS–DF/NN) is evaluated using various benchmark data sets from the UCI Machine Learning Repository. Results indicate a significant increase in performance when compared with various well-known classifiers. The proposed ensemble method is also compared with an ensemble classifier using different distance metrics but with the same feature vector (with or without Feature Selection (FS)).
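The core idea described above — several NN classifiers, each pairing its own distance metric with its own feature subset, combined by majority vote — can be sketched as follows. This is an illustrative sketch only: in the chapter, the feature subset for each metric is selected by Tabu Search, whereas here the subsets, toy data, and all function names are hypothetical and fixed by hand to show just the ensemble structure.

```python
# Sketch of a heterogeneous NN ensemble: each member has its own distance
# function and its own feature subset; predictions are combined by majority
# vote. Feature subsets are fixed by hand here (the chapter selects them
# with Tabu Search); all names and data are illustrative.
from collections import Counter
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def nn_predict(sample, train_X, train_y, dist, features):
    # 1-NN: label of the closest training sample under this member's
    # metric, computed only over the member's own feature subset.
    proj = lambda v: [v[i] for i in features]
    best = min(range(len(train_X)),
               key=lambda i: dist(proj(train_X[i]), proj(sample)))
    return train_y[best]

def ensemble_predict(sample, train_X, train_y, members):
    # Simple majority vote over the heterogeneous NN members.
    votes = [nn_predict(sample, train_X, train_y, d, f) for d, f in members]
    return Counter(votes).most_common(1)[0][0]

# Toy data: 3 features, 2 classes.
X = [[0.0, 0.1, 5.0], [0.2, 0.0, 4.5], [3.0, 3.1, 0.0], [2.8, 3.0, 0.2]]
y = ["a", "a", "b", "b"]
# Each member pairs a distance function with its own feature subset.
members = [(euclidean, [0, 1]), (manhattan, [0, 2]), (chebyshev, [1, 2])]
print(ensemble_predict([0.1, 0.05, 4.8], X, y, members))  # prints "a"
```

Because the members disagree both on the metric and on which features it sees, their errors tend to be less correlated than in a single-metric ensemble, which is the motivation for combining them.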
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
Cite this chapter
Tahir, M., Smith, J. (2007). Feature Selection for Heterogeneous Ensembles of Nearest-neighbour Classifiers Using Hybrid Tabu Search. In: Siarry, P., Michalewicz, Z. (eds) Advances in Metaheuristics for Hard Optimization. Natural Computing Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72960-0_4
DOI: https://doi.org/10.1007/978-3-540-72960-0_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72959-4
Online ISBN: 978-3-540-72960-0
eBook Packages: Computer Science (R0)