
Weighting features

Conference paper
Case-Based Reasoning Research and Development (ICCBR 1995)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1010)


Abstract

Many case-based reasoning algorithms retrieve cases using a derivative of the k-nearest neighbor (k-NN) classifier, whose similarity function is sensitive to irrelevant, interacting, and noisy features. Many proposed methods for reducing this sensitivity parameterize k-NN's similarity function with feature weights. We focus on methods that automatically assign weight settings using little or no domain-specific knowledge. Our goal is to predict the relative capabilities of these methods for specific dataset characteristics. We introduce a five-dimensional framework that categorizes automated weight-setting methods, empirically compare methods along one of these dimensions, summarize our results with four hypotheses, and describe additional evidence that supports them. Our investigation revealed that most methods correctly assign low weights to completely irrelevant features, and methods that use performance feedback demonstrate three advantages over other methods (i.e., they require less pre-processing, better tolerate interacting features, and increase learning rate).
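The abstract describes parameterizing k-NN's similarity function with per-feature weights so that irrelevant or noisy features contribute less to the distance. The sketch below is illustrative only, not the paper's algorithm: the function name, toy data, and hand-set weights are all assumptions, and the weight vector here is fixed rather than learned by any of the automated methods the paper compares.

```python
import numpy as np

def weighted_knn_predict(query, X, y, weights, k=3):
    """Classify `query` by majority vote among its k nearest cases,
    with each feature's contribution to the distance scaled by a weight."""
    diffs = X - query                                  # (n_cases, n_features)
    dists = np.sqrt(((weights * diffs) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]                    # indices of the k closest cases
    labels, counts = np.unique(y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy case base (hypothetical): feature 0 is relevant, feature 1 is pure noise.
X = np.array([[0.0, 5.0], [0.1, -3.0], [1.0, 4.0], [0.9, -2.0]])
y = np.array([0, 0, 1, 1])

# Assigning a low weight to the noisy feature lets the relevant one dominate,
# which is the effect the weight-setting methods under study aim to automate.
print(weighted_knn_predict(np.array([0.95, 5.0]), X, y,
                           weights=np.array([1.0, 0.0]), k=3))
```

With the noise feature zeroed out, the query's nearest cases are the two class-1 cases; with uniform weights, the noise feature would dominate the distance and could flip the vote.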




Editor information

Manuela Veloso, Agnar Aamodt


Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wettschereck, D., Aha, D.W. (1995). Weighting features. In: Veloso, M., Aamodt, A. (eds) Case-Based Reasoning Research and Development. ICCBR 1995. Lecture Notes in Computer Science, vol 1010. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60598-3_31


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60598-0

  • Online ISBN: 978-3-540-48446-2
