Abstract
One important aspect of Case-Based Reasoning (CBR) is case selection, or editing: choosing which cases to include in (or remove from) a case base. Selection can be motivated by either space considerations or quality considerations. One advantage of CBR is that it is equally useful for boolean, nominal, ordinal, and numeric prediction tasks. However, many case selection research efforts have focused on domains with nominal or boolean predictions, and most case selection methods have relied on that problem structure. In this paper, we present details of a systematic sequence of experiments with variations on CBR case selection for numeric prediction. The emphasis in this project has been on case quality: an attempt to filter out cases that may be noisy or idiosyncratic and are therefore poor guides for future prediction. Our results indicate that case selection can significantly increase the percentage of correct predictions, at the expense of an increased risk of poor predictions in less common cases.
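To make the idea of quality-driven case editing for numeric prediction concrete, the following is a minimal sketch and not the authors' method. It assumes a leave-one-out filter in the spirit of edited nearest-neighbor rules: a case is dropped when the mean of its k nearest neighbors' values misses its own value by more than a tolerance. The names `knn_predict`, `edit_case_base`, and `max_error`, the toy data, and the tolerance of 3.0 are all hypothetical.

```python
import math

def knn_predict(case_base, query, k=3):
    """Predict a numeric value as the mean target of the k nearest cases."""
    nearest = sorted(case_base, key=lambda c: math.dist(c[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)

def edit_case_base(case_base, k=3, max_error=1.0):
    """Keep cases whose leave-one-out prediction error is within max_error.

    max_error is a hypothetical tolerance chosen for illustration only.
    """
    kept = []
    for i, (features, target) in enumerate(case_base):
        # Predict this case's value from all the other cases.
        others = case_base[:i] + case_base[i + 1:]
        if abs(knn_predict(others, features, k) - target) <= max_error:
            kept.append((features, target))
    return kept

# Toy case base: (feature vector, numeric outcome). The case at x = 1.5
# has an outcome far from its neighbors' and plays the role of a noisy case.
cases = [((0.0,), 1.0), ((1.0,), 1.1), ((2.0,), 0.9), ((3.0,), 1.0), ((1.5,), 9.0)]
edited = edit_case_base(cases, k=3, max_error=3.0)
```

In this toy data the outlying case also inflates the leave-one-out error of its neighbors, which is why the tolerance is set loosely. With a numeric target there is no simple "misclassified" test, so some such threshold on prediction error has to be chosen.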
Manuscript received October 12, 2009.
Both authors are with La Salle University, Philadelphia, PA 19141 USA (corresponding author M.A. Redmond phone: 215-951-1096; e-mail: redmond@lasalle.edu).
1 Numeric prediction is called “regression” by some researchers, after the statistical technique long used for it.
Copyright information
© 2010 Springer Science+Business Media B.V.
About this paper
Cite this paper
Redmond, M.A., Highley, T. (2010). Empirical Analysis of Case-Editing Approaches for Numeric Prediction. In: Sobh, T., Elleithy, K. (eds) Innovations in Computing Sciences and Software Engineering. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9112-3_14
DOI: https://doi.org/10.1007/978-90-481-9112-3_14
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-9111-6
Online ISBN: 978-90-481-9112-3
eBook Packages: Computer Science (R0)