Hardware-Friendly Actor-Critic Reinforcement Learning Through Modulation of Spike-Timing-Dependent Plasticity

Abstract:

In this work, we propose a hardware-friendly reinforcement learning algorithm. The learning algorithm is based on an actor-critic structure implemented with spiking neural networks (SNNs). A biologically plausible and hardware-friendly spike-timing-dependent plasticity (STDP) learning rule is formulated and employed in the training of SNNs. Several important aspects of applying the learning rule in a reinforcement learning context are studied, especially from the circuit designer's point of view. Pitfalls of potential noise mixing and correlated spikes are identified and properly addressed. To achieve a low-power learning architecture, techniques such as down-sampling data for certain learning blocks, injecting quantization noise as noisy residues in neurons, and proper memory partitioning are proposed. A 1-D state-value function learning problem and a 2-D maze walking problem are examined in this paper to illustrate the effectiveness of the proposed algorithm and learning rules. A low-power hardware architecture is proposed, and examples are implemented in Verilog. The hardware complexity of the proposed algorithm is analyzed, and potential solutions for breaking the memory bottleneck when the problem size grows large are also discussed.
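
The abstract does not give the learning rule itself, so the following is only a rough illustration of the general idea of modulating STDP with a reinforcement signal, not the rule formulated in the paper. It assumes a textbook pair-based exponential STDP window and a standard temporal-difference (TD) error from the critic as the modulating signal; every function name, parameter, and constant below is hypothetical.

```python
import math

def stdp_window(dt, a_plus=1.0, a_minus=1.0, tau=20.0):
    """Generic pair-based STDP kernel (assumed form, not taken from the paper).

    dt = t_post - t_pre in milliseconds. Pre-before-post (dt > 0) potentiates,
    post-before-pre (dt < 0) depresses, simultaneous spikes give no change.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    if dt < 0:
        return -a_minus * math.exp(dt / tau)
    return 0.0

def td_error(reward, v_next, v_now, gamma=0.9):
    """Standard TD(0) error, assumed here as the critic's modulating signal."""
    return reward + gamma * v_next - v_now

def modulated_update(weight, dt, delta, lr=0.01, w_min=0.0, w_max=1.0):
    """Scale the STDP change by the TD error, then clip to a bounded range.

    The bounds stand in for a fixed-width hardware weight; they are assumptions.
    """
    new_weight = weight + lr * delta * stdp_window(dt)
    return min(w_max, max(w_min, new_weight))

# Hypothetical usage: a causal spike pair, reinforced by a positive TD error.
delta = td_error(reward=1.0, v_next=0.0, v_now=0.0)
w = modulated_update(weight=0.5, dt=5.0, delta=delta)
```

With a positive TD error, a causal spike pair strengthens the synapse; with a negative TD error, the same pair weakens it. That sign flip is what distinguishes modulated STDP from plain STDP in this sketch.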

Published in: IEEE Transactions on Computers ( Volume: 66, Issue: 2, 01 February 2017)

Page(s): 299 - 311

Date of Publication: 27 July 2016

DOI: 10.1109/TC.2016.2595580
