Abstract:
Soft actor-critic (SAC) is a reinforcement learning algorithm that employs the maximum entropy framework to train a stochastic policy. This work examines a specific failure case of SAC in which the stochastic policy is trained to maximize expected entropy in a sparse-reward environment. We demonstrate that the over-exploration of SAC can cause the entropy temperature to collapse, which in turn leads to unstable updates of the actor. Based on our analysis, we introduce Reg-SAC, an improved version of SAC, to mitigate the detrimental effect of the entropy temperature on the learning stability of the stochastic policy. Reg-SAC clips the entropy temperature to prevent its collapse and regularizes the gradient updates of the policy via a Kullback-Leibler divergence penalty. In experiments on discrete-action benchmarks, the proposed Reg-SAC outperforms standard SAC in sparse-reward grid-world environments while maintaining competitive performance on the dense-reward Atari benchmark. These results highlight that our regularized version makes the stochastic policy of SAC more stable in discrete-action domains.
Published in: IEEE Transactions on Systems, Man, and Cybernetics: Systems (Volume 55, Issue 2, February 2025)
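
The abstract attributes two mechanisms to Reg-SAC: a lower bound (clipping value) on the entropy temperature and a Kullback-Leibler regularizer on the policy update. The sketch below is not the authors' implementation; it only illustrates, for a discrete-action SAC agent in PyTorch, how such mechanisms are commonly realized. The names ALPHA_MIN, KL_COEF, and the helper functions are assumptions, since the paper's exact formulation is not reproduced here.

# Minimal sketch (not the authors' code) of the two mechanisms the abstract
# describes for Reg-SAC: clipping the entropy temperature so it cannot collapse,
# and penalizing the actor update with a KL divergence toward the previous policy.
# ALPHA_MIN and KL_COEF are assumed hyperparameters.
import math

import torch
import torch.nn.functional as F

ALPHA_MIN = 1e-3   # assumed lower bound on the entropy temperature
KL_COEF = 0.1      # assumed weight of the KL regularizer

def temperature_loss(log_alpha, log_probs, target_entropy):
    # Standard SAC temperature objective: drive policy entropy toward the target.
    return -(log_alpha.exp() * (log_probs + target_entropy).detach()).mean()

def clip_temperature(log_alpha, alpha_min=ALPHA_MIN):
    # Applied after each temperature optimizer step: keep alpha >= alpha_min
    # so the temperature cannot collapse toward zero.
    with torch.no_grad():
        log_alpha.clamp_(min=math.log(alpha_min))

def regularized_actor_loss(logits, old_logits, q_values, alpha, kl_coef=KL_COEF):
    # Discrete-action SAC actor loss plus a KL penalty toward the previous policy.
    probs = F.softmax(logits, dim=-1)
    log_probs = F.log_softmax(logits, dim=-1)
    # Soft policy-improvement term, taken in expectation over actions.
    sac_term = (probs * (alpha * log_probs - q_values)).sum(dim=-1).mean()
    # KL(pi_new || pi_old) discourages abrupt policy shifts once alpha is small.
    old_log_probs = F.log_softmax(old_logits, dim=-1).detach()
    kl = (probs * (log_probs - old_log_probs)).sum(dim=-1).mean()
    return sac_term + kl_coef * kl

Under these assumptions, the clamp keeps the entropy term from vanishing entirely, while the KL penalty bounds how far a single gradient step can move the policy once the entropy bonus becomes small; the balance between the two is set by the assumed constants above.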