ABSTRACT
Neuroscience provides a rich source of inspiration for new algorithms and architectures to employ when building AI, and the resulting biologically-plausible approaches offer formal, testable models of brain function. The working memory toolkit (WMtk) was developed to ease the integration of an artificial neural network (ANN)-based computational neuroscience model of working memory into reinforcement learning (RL) agents, hiding the details of ANN design behind a simple symbolic encoding interface. While the WMtk allows RL agents to perform well in partially-observable domains, it requires the programmer to prefilter sensory information: a task often delegated to dimensional attention mechanisms in other cognitive architectures. To fill this gap, we develop and test a biologically-plausible dimensional attention filter for the WMtk and validate its performance on a partially-observable 1D maze task. We show that the attention filter improves learning behavior in two ways: 1) it speeds up learning early in training, and 2) it develops emergent alternative strategies which optimize performance over the long term.
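As a rough illustration of the architecture the abstract describes, the following is a minimal, self-contained Python sketch: a linear Q-learner on a partially-observable 1D corridor whose combined observation-plus-memory feature vector is gated by learned dimensional attention gains. Every name here (`observe`, `attn`, the cue-driven memory gate) is an illustrative assumption, not the WMtk interface, and the attention update is a simple TD-gradient rule standing in for the paper's specific filter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1D signal maze: a cue on the first step tells the agent
# which end of the corridor is rewarded; the cue then disappears, so the
# agent must retain it in a working-memory slot.
N = 7                                   # corridor length

def reset():
    goal = int(rng.integers(2))         # 0 = left end rewarded, 1 = right
    return N // 2, goal                 # start in the middle

def observe(pos, goal, t):
    x = np.zeros(N + 2)                 # position one-hot + two cue bits
    x[pos] = 1.0
    if t == 0:                          # cue visible only on the first step
        x[N + goal] = 1.0
    return x

def step(pos, action):                  # action: 0 = move left, 1 = move right
    pos += -1 if action == 0 else 1
    return pos, pos in (0, N - 1)       # new position, episode done?

# Linear Q-learning over attention-gated features. The feature vector is
# the current observation concatenated with the working-memory contents;
# `attn` holds one nonnegative gain per dimension (the attention filter).
D = 2 * (N + 2)
W = np.zeros((2, D))                    # Q-weights, one row per action
attn = np.ones(D)                       # dimensional attention gains
alpha, gamma, eps = 0.05, 0.95, 0.1

for episode in range(3000):
    pos, goal = reset()
    mem = np.zeros(N + 2)
    for t in range(2 * N):
        obs = observe(pos, goal, t)
        if obs[N:].any():               # naive gate: store cue-bearing input
            mem = obs.copy()
        raw = np.concatenate([obs, mem])
        phi = attn * raw
        a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(W @ phi))
        pos, done = step(pos, a)
        r = 1.0 if done and pos == (0 if goal == 0 else N - 1) else 0.0
        phi2 = attn * np.concatenate([observe(pos, goal, t + 1), mem])
        delta = (r if done else r + gamma * np.max(W @ phi2)) - W[a] @ phi
        grad_attn = W[a] * raw          # dQ/d(attn) = W[a] * raw
        W[a] += alpha * delta * phi     # dQ/dW[a]   = attn * raw = phi
        attn += alpha * delta * grad_attn
        attn = np.clip(attn, 0.0, None) # keep attention gains nonnegative
        if done:
            break
```

Gating both the raw observation and the memory slot through the same attention vector lets the learner suppress uninformative position bits while keeping the retained cue dimensions amplified, which is the qualitative behavior the abstract attributes to the filter.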