
Feature Selection for Heterogeneous Ensembles of Nearest-neighbour Classifiers Using Hybrid Tabu Search

  • Chapter
Advances in Metaheuristics for Hard Optimization

Part of the book series: Natural Computing Series (NCS)

Abstract

The nearest-neighbour (NN) classifier has long been used in pattern recognition, exploratory data analysis, and data mining problems. A vital consideration in obtaining good results with this technique is the choice of distance function and, correspondingly, which features to consider when computing distances between samples. In this chapter, a new ensemble technique is proposed to improve the performance of NN classifiers. The proposed approach combines multiple NN classifiers, where each classifier uses a different distance function and potentially a different set of features (feature vector). These feature vectors are determined for each distance metric using a Simple Voting Scheme incorporated in Tabu Search (TS). The proposed ensemble classifier with different distance metrics and different feature vectors (TS–DF/NN) is evaluated using various benchmark data sets from the UCI Machine Learning Repository. Results indicate a significant increase in performance when compared with various well-known classifiers. The proposed ensemble method is also compared with an ensemble classifier that uses different distance metrics but the same feature vector (with or without Feature Selection (FS)).
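
The core idea of the abstract, several NN classifiers that each use their own distance function and feature subset and are combined by a simple vote, can be illustrated with a minimal Python sketch. This is not the chapter's implementation. The three distance functions, the toy data, and the function names are assumptions chosen for illustration, and the feature masks are hand-picked here rather than found by Tabu Search.

```python
from collections import Counter

# Three common distance functions (assumed; the abstract does not name them).
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def project(sample, mask):
    """Keep only the features whose mask entry is 1."""
    return [x for x, keep in zip(sample, mask) if keep]

def nn_predict(train_X, train_y, query, distance, mask):
    """1-NN prediction with one distance function and one feature mask."""
    q = project(query, mask)
    nearest = min(
        range(len(train_X)),
        key=lambda i: distance(project(train_X[i], mask), q),
    )
    return train_y[nearest]

def ensemble_predict(train_X, train_y, query, members):
    """Simple majority vote over (distance, mask) ensemble members."""
    votes = [nn_predict(train_X, train_y, query, d, m) for d, m in members]
    return Counter(votes).most_common(1)[0][0]

# Toy data: the third feature is noise that does not separate the classes.
train_X = [[0.0, 0.0, 5.0], [0.2, 0.1, -5.0], [1.0, 1.0, 5.0], [0.9, 1.1, -5.0]]
train_y = ["a", "a", "b", "b"]

# Each member pairs a distance function with its own feature mask.
# Every mask must select at least one feature.
members = [
    (euclidean, [1, 1, 0]),
    (manhattan, [1, 0, 0]),
    (chebyshev, [0, 1, 0]),
]

prediction = ensemble_predict(train_X, train_y, [0.1, 0.2, 5.0], members)
print(prediction)
```

In the approach the abstract describes, a search procedure would propose candidate masks for each distance metric and score them by the voted ensemble's accuracy. The sketch fixes the masks to keep the voting step easy to follow.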




Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Tahir, M., Smith, J. (2007). Feature Selection for Heterogeneous Ensembles of Nearest-neighbour Classifiers Using Hybrid Tabu Search. In: Siarry, P., Michalewicz, Z. (eds) Advances in Metaheuristics for Hard Optimization. Natural Computing Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72960-0_4


  • DOI: https://doi.org/10.1007/978-3-540-72960-0_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72959-4

  • Online ISBN: 978-3-540-72960-0

  • eBook Packages: Computer Science (R0)
