Abstract
In their unmodified form, lazy-learning algorithms may have difficulty learning and tracking time-varying input/output function maps such as those that occur in concept shift. Extensions of these algorithms, such as Time-Windowed Forgetting (TWF), can permit learning of time-varying mappings by deleting older exemplars, but have decreased classification accuracy when the input-space sampling distribution of the learning set is time-varying. Additionally, TWF suffers from lower asymptotic classification accuracy than equivalent non-forgetting algorithms when the input sampling distributions are stationary. Other shift-sensitive algorithms, such as Locally-Weighted Forgetting (LWF), avoid the negative effects of time-varying sampling distributions, but still have lower asymptotic classification accuracy in non-varying cases. We introduce Prediction Error Context Switching (PECS), which allows lazy-learning algorithms to achieve good classification accuracy under time-varying function mappings and input sampling distributions, while still maintaining their asymptotic classification accuracy in static tasks. PECS works by selecting and re-activating previously stored instances based on their most recent consistency record. The classification accuracy and active learning-set sizes of the above algorithms are compared on a set of learning tasks that illustrate the differing time-varying conditions described above. The results show that PECS has the best overall classification accuracy across these differing time-varying conditions, while retaining asymptotic classification accuracy competitive with unmodified lazy learners intended for static environments.
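The core idea of PECS, selecting and re-activating stored instances based on their recent consistency record rather than deleting them by age, can be illustrated with a minimal sketch. This is not the paper's algorithm: the 1-nearest-neighbour learner, the fixed consistency window, and the activation thresholds (`window`, `deactivate_below`, `reactivate_above`) are illustrative assumptions.

```python
from collections import deque

class PECSSketch:
    """Illustrative sketch of prediction-error-based context switching
    over a 1-nearest-neighbour lazy learner. Window size and thresholds
    are assumed parameters, not those of the original PECS algorithm."""

    def __init__(self, window=5, deactivate_below=0.4, reactivate_above=0.6):
        self.window = window
        self.deactivate_below = deactivate_below
        self.reactivate_above = reactivate_above
        # Each stored exemplar keeps a bounded record of recent prediction
        # successes and an active/inactive flag; nothing is ever deleted.
        self.store = []

    def _nearest(self, x, active_only=True):
        candidates = ([e for e in self.store if e['active']]
                      if active_only else self.store)
        if not candidates:
            return None
        return min(candidates, key=lambda e: abs(e['x'] - x))

    def predict(self, x):
        e = self._nearest(x)           # classify from active exemplars only
        return e['y'] if e else None

    def learn(self, x, y):
        # Update the consistency record of the nearest stored exemplar
        # (active or inactive): did its label agree with the new example?
        e = self._nearest(x, active_only=False)
        if e is not None:
            e['hits'].append(1 if e['y'] == y else 0)
            rate = sum(e['hits']) / len(e['hits'])
            if e['active'] and rate < self.deactivate_below:
                e['active'] = False    # recently inconsistent: deactivate
            elif not e['active'] and rate > self.reactivate_above:
                e['active'] = True     # consistent again: re-activate
        # Store the new instance as an active exemplar.
        self.store.append({'x': x, 'y': y,
                           'hits': deque(maxlen=self.window),
                           'active': True})
```

Because inconsistent exemplars are deactivated rather than discarded, a return to an earlier concept can re-activate them from their improving consistency record, which is what lets this scheme keep asymptotic accuracy in static tasks while still tracking shift.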
Salganicoff, M. Tolerating Concept and Sampling Shift in Lazy Learning Using Prediction Error Context Switching. Artificial Intelligence Review 11, 133–155 (1997). https://doi.org/10.1023/A:1006515405170