DOI: 10.1145/12808.12842

Memory length as a feedback parameter in learning systems

Published: 01 December 1986

ABSTRACT

In a classic learning experiment, a higher vertebrate is presented with two levers. To start, a reward is given if the left lever is pushed and no reward is given if the right lever is pushed. After a certain period of time, T, the meaning of the two levers is interchanged: now a reward is given if the right lever is pushed and no reward if the left. This holds for the same period of time T, and then the meaning of the two levers is interchanged once again for a period T. This alternation is repeated a number of times.

Higher vertebrates exhibit learning in this situation. With each repeated period of time T, they adapt their response to the correct lever more quickly. In other words, at first they are slow to change from one lever to the other, but they gradually learn that the reward at each lever is being switched with period T.

The above experiment strongly suggests that memory retention is being adjusted to the length of time that a reward is given at each lever. While it is difficult to determine the exact mechanism by which this is done, a simple feedback control system models this behavior, as shown below.

In the system of Figure 1, if the error signal is too large, say it exceeds a specified critical error, there is a decrease, s, in the memory length. If the error signal is low, say it falls below the critical error, there is an increase, p, in the memory length. As described, this is a binary alternative in the change of memory length, as shown in Figure 2. It is of course possible to have more than two alternatives in the changes of memory length.
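This rule can be written directly in code. Below is a minimal sketch in Python; the function name and the lower bound f_min are illustrative rather than from the text, and the growth of f by 1 + p matches the worked examples given later (the new sample plus a penetration of p points into past history).

```python
def update_memory_length(error, critical_error, f, p=0, s=1, f_min=2):
    """Binary memory-length feedback (sketch; names are illustrative).

    Large error -> shrink memory by s (forget stale history faster).
    Small error -> grow memory by 1 + p (the new sample plus a
    penetration of p points into past history).
    """
    if abs(error) > critical_error:
        return max(f_min, f - s)  # f_min is an assumed floor: a line fit needs >= 2 points
    return f + 1 + p
```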

In [1] the above system was applied to the tracking of a maneuvering aircraft, where it is assumed that the aircraft follows a periodic linear spline function. That is, the aircraft in the absence of noise follows the path shown in Figure 3.

There are a certain number of samples of the aircraft position during each time interval T. The noise disturbances at each sample are chosen to be independent samples of normally distributed noise with mean 0 and constant standard deviation σ.
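For simulation, such observations can be generated as follows. This is a sketch that assumes the periodic linear spline of Figure 3 is a triangular wave whose linear segments each span T samples; the shape and the amplitude parameter are assumptions, since Figure 3 itself is not reproduced here.

```python
import numpy as np

def spline_trajectory(n_samples, T, amplitude=1.0):
    """Periodic linear spline, assumed triangular: the path rises for
    T samples, then falls for T samples, repeating."""
    t = np.arange(n_samples)
    phase = (t % (2 * T)) / T  # runs 0..2 over one up-down cycle
    return amplitude * np.where(phase < 1.0, phase, 2.0 - phase)

def noisy_observations(path, sigma, rng=None):
    """Add independent N(0, sigma^2) noise at each sample, as stated above."""
    rng = np.random.default_rng() if rng is None else rng
    return path + rng.normal(0.0, sigma, size=path.shape)
```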

To simplify, let the filter model be a straight-line, least-squares model. Other models are possible, for example a straight-line, weighted-least-squares model.

Let E be the error signal. In [1], p = 0 when |E| ≤ Kσ and s = 1 when |E| > Kσ, where K is a constant and Kσ is the critical error. While recursive filter models are given in [1], the operation may be viewed here as a sequence of nonrecursive linear regressions over f sample points, where f is the adjustable memory length.
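Viewed this way, the tracker fits in a short loop. The sketch below fits a straight line by least squares to the last f points at each step and adjusts f with the feedback rule sketched earlier; it is not the recursive filter of [1]. Taking E to be the one-step prediction error of the fitted line, and the starting length f0, are assumptions made for illustration.

```python
import numpy as np

def track(observations, sigma, K=2.0, p=0, s=1, f0=4, f_min=2):
    """Nonrecursive adaptive-memory tracker (sketch).

    At step n, fit a line to the last f observations, output the fitted
    position at n, then adjust f from the next prediction error."""
    f = f0
    estimates = [observations[0]]  # a single point admits no line fit
    lengths = [f]
    for n in range(1, len(observations)):
        lo = max(0, n - f + 1)  # window of (at most) f points
        x = np.arange(lo, n + 1, dtype=float)
        slope, intercept = np.polyfit(x, observations[lo:n + 1], 1)
        estimates.append(slope * n + intercept)
        if n + 1 < len(observations):
            E = observations[n + 1] - (slope * (n + 1) + intercept)
            f = update_memory_length(E, K * sigma, f, p, s, f_min)
        lengths.append(f)
    return np.array(estimates), np.array(lengths)
```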

In Figure 4, if the error signal is less than Kσ, the memory length increases from 7 to 8, fitting points 11 to 18. This corresponds to p = 0 because there is no penetration into past history prior to point 11.

In Figure 4, if the error signal is greater than Kσ, the memory length decreases from 7 to 6, fitting points 13 to 18. This corresponds to s = 1 because there is a shrinkage of memory length by 1.

Other values for p and s are possible. In the above we have seen a binary alternative where either f increases by 1 (p = 0) or f decreases by 1 (s = 1).

Many questions remain about the choice of alternatives. In what follows, we answer the three questions below.

  • Is there any improvement in going from 2 alternatives, where f increases or decreases by 1, to 3 alternatives, where f increases by 1, decreases by 1, or remains the same?

  • Is there any improvement in using 2 alternatives corresponding to p = 1 (f increases by 2) and s = 2 (f decreases by 2)?

  • Is there any improvement in using 2 alternatives corresponding to high values of s and p?

First, is there any improvement in allowing f to remain unchanged, in addition to increasing or decreasing by one? We have run a number of simulations to test this and find no discernible improvement over the binary alternative (increasing or decreasing by one only). From the standpoint of models for learning systems, this result is not surprising.

Second, the system in which p = 1 and s = 2 shows improvement over the system in which p = 0 and s = 1. The latter may be found in [1]. The improved performance of the former is shown in Figure 5.

The improvement in Figure 5 apparently stems from the fact that f changes by a greater amount at each step. For p = 1, the model penetrates one point into past history, increasing f by 2 rather than 1. For s = 2, the model shrinks f by 2 rather than 1.
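As a usage illustration (the parameter values here are hypothetical, not the simulation settings of [1]), the two settings can be compared side by side with the sketches above:

```python
rng = np.random.default_rng(0)
path = spline_trajectory(n_samples=400, T=50)  # linear segments of T = 50 samples
obs = noisy_observations(path, sigma=0.1, rng=rng)

for p, s in [(0, 1), (1, 2)]:
    est, _ = track(obs, sigma=0.1, K=2.0, p=p, s=s)
    rmse = float(np.sqrt(np.mean((est - path) ** 2)))
    print(f"p={p}, s={s}: RMS tracking error = {rmse:.3f}")
```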

Note that in Figure 5 the learning is rapid, in that the overshoot at each knot of the spline function rapidly decreases through the sequence of knots B, C, D, E, F.

As can be seen in Figure 5, the filter output rapidly learns the period T; this is apparent between knots E and F. In fact, the steady-state value of f has a root-mean-square error of only 3.5 sample points, with essentially no bias error.

Third, could large increases or decreases in f lead to improved performance? The answer is a definite no: performance in such cases is very poor. Apparently the best performance is found around p = 0 and s = 1, or p = 1 and s = 2, as noted above.

References

1. G. Epstein, "Adaptive Memory Trackers," 1971 Fall Joint Computer Conference Proceedings, pp. 663-668, November 1971.

Published in

ISMIS '86: Proceedings of the ACM SIGART international symposium on Methodologies for intelligent systems, December 1986, 450 pages. ISBN 0897912063. DOI: 10.1145/12808. Copyright © 1986 ACM.

Publisher: Association for Computing Machinery, New York, NY, United States.
