DOI: 10.1145/3637843.3637849

Complex Skill Learning using Modified Transformers with Automatic Chunking and Forgetting

Published: 06 March 2024

Abstract

Transformers are becoming mainstream in reinforcement learning because they can learn complex behaviour sequences over long horizons. Human learning and decision-making rely on recollection of only the parts of memory relevant to the current context; the past contains much information that is not useful for decisions in the present. The brain avoids storing every past event by forgetting less important ones and attaching importance only to events that lead to the desired outcome, which yields a broader picture of the past and supports faster decision making. Current transformer models do not scale to large memory sizes or learn long-horizon tasks efficiently, because the computation required to run them grows non-linearly with memory length. We propose Automatic Chunking + ForgetSpan, an architecture that reduces the computation needed to train and run transformer models through automatic chunking and forgetting. Automatic chunking partitions the memory into segments that may be relevant in the present context, while forgetting removes events from memory that do not contribute to learning; together they help the model learn multiple sequences of events and lower the computational cost of larger memory sizes. We demonstrate that Automatic Chunking + ForgetSpan helps models memorize important information and improves performance with reduced computational requirements on memory, visual navigation, robotic locomotion and multimodal tasks. We also test SimilarityWeight, a method that weighs a new memory element by its similarity to elements already in the buffer. A comparative analysis with existing memory architectures was performed to understand the effects of relevant memory selection and forgetting. Gating connections based on gated recurrent units and mask-based memory mean pooling are used for further performance gains. We also study the effect of placing multiple memory-based transformers with automatic chunking in series, operating on data from consecutive timesteps.
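The implementation is not reproduced on this page, but the abstract's description of SimilarityWeight and forgetting suggests a buffer-level mechanism. Below is a minimal, hypothetical sketch that assumes cosine similarity for SimilarityWeight and a fixed importance threshold for forgetting; the class name, parameters and thresholds are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch of similarity-weighted memory insertion and
# importance-based forgetting; not the paper's implementation.
import torch
import torch.nn.functional as F


class EpisodicMemoryBuffer:
    """Toy memory buffer with similarity-weighted insertion and forgetting."""

    def __init__(self, capacity: int, dim: int, forget_threshold: float = 0.1):
        self.capacity = capacity
        self.forget_threshold = forget_threshold  # assumed importance cutoff
        self.keys = torch.empty(0, dim)           # stored memory embeddings
        self.importance = torch.empty(0)          # per-element importance scores

    def similarity_weight(self, new_key: torch.Tensor) -> torch.Tensor:
        # SimilarityWeight (assumed form): a new element that is highly similar
        # to something already stored receives a weight close to zero.
        if self.keys.shape[0] == 0:
            return torch.tensor(1.0)
        sims = F.cosine_similarity(self.keys, new_key.unsqueeze(0), dim=-1)
        return 1.0 - sims.max().clamp(min=0.0, max=1.0)

    def insert(self, new_key: torch.Tensor, importance: float) -> None:
        # Scale the element's importance by its novelty before storing it.
        w = self.similarity_weight(new_key)
        self.keys = torch.cat([self.keys, new_key.unsqueeze(0)], dim=0)
        self.importance = torch.cat([self.importance, (w * importance).reshape(1)])
        self.forget()

    def forget(self) -> None:
        # Drop elements whose importance falls below the threshold, then
        # keep only the top-`capacity` elements by importance.
        keep = self.importance > self.forget_threshold
        self.keys, self.importance = self.keys[keep], self.importance[keep]
        if self.keys.shape[0] > self.capacity:
            top = self.importance.topk(self.capacity).indices
            self.keys, self.importance = self.keys[top], self.importance[top]


# Example: insert a few random 8-dimensional memory embeddings.
buf = EpisodicMemoryBuffer(capacity=4, dim=8)
for _ in range(6):
    buf.insert(torch.randn(8), importance=1.0)
print(buf.keys.shape, buf.importance)
```

Automatic chunking would additionally group the surviving elements into context-relevant segments before they are attended over; that step is omitted from this sketch.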


Published In

ICRAI '23: Proceedings of the 2023 9th International Conference on Robotics and Artificial Intelligence
November 2023
72 pages
ISBN: 9798400708282
DOI: 10.1145/3637843
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Forgetting
  2. Reinforcement Learning
  3. Transformers

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICRAI 2023
