Multitask Policy Adversarial Learning for Human-Level Control With Large State Spaces


Abstract:

The sequential decision-making problem with large-scale state spaces is an important and challenging topic for multitask reinforcement learning (MTRL). Training near-optimal policies across tasks suffers from a lack of prior knowledge in discrete-time nonlinear environments, especially for continuous task variations, and requires scalable approaches to transfer prior knowledge to new tasks when the number of tasks is large. This paper proposes a multitask policy adversarial learning (MTPAL) method for learning a nonlinear feedback policy that generalizes across multiple tasks, bringing the cognitive ability of a robot much closer to human-level decision making. The key idea is to construct a parametrized policy model directly from large high-dimensional observations using deep function approximators, and then to train an optimal sequential decision policy for each new task by an adversarial process in which two models are trained simultaneously: a multitask policy generator transforms samples drawn from a prior distribution into samples from a complex data distribution of higher dimensionality, and a multitask policy discriminator decides whether a given sample comes from the empirically derived human-level prior distribution or from the generator. All the related empirically derived human-level knowledge is integrated into the sequential decision policy, transferring the human-level policy at every layer of a deep policy network. Extensive experiments on four different WeiChai Power manufacturing data sets show that our approach can simultaneously surpass human performance on tasks ranging from cart-pole to production assembly control.
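
The abstract does not give the MTPAL loss functions, so the following is only a minimal sketch of the generic two-model adversarial objective it describes, assuming the standard binary cross-entropy form used in generative adversarial networks. The function names `discriminator_loss` and `generator_loss`, and the convention that the discriminator outputs the probability that a sample is human-derived, are illustrative assumptions and not taken from the paper.

```python
import math

EPS = 1e-12  # guards against log(0); an implementation detail of this sketch


def discriminator_loss(d_human, d_generated):
    """Cross-entropy loss for the discriminator (assumed form, not from the paper).

    d_human:     discriminator outputs in (0, 1) on human-derived samples (target 1)
    d_generated: discriminator outputs in (0, 1) on generator samples (target 0)
    """
    human_term = -sum(math.log(p + EPS) for p in d_human) / len(d_human)
    generated_term = -sum(math.log(1.0 - p + EPS) for p in d_generated) / len(d_generated)
    return human_term + generated_term


def generator_loss(d_generated):
    """Non-saturating generator loss: lower when the discriminator
    rates generator samples as more likely to be human-derived."""
    return -sum(math.log(p + EPS) for p in d_generated) / len(d_generated)


# Hypothetical discriminator outputs for one batch.
print(discriminator_loss([0.9, 0.8], [0.1, 0.2]))
print(generator_loss([0.1, 0.2]))
```

In an adversarial training loop the two losses would be minimized in alternation, each with respect to its own model's parameters; the sketch leaves out the models and the optimizer and shows only the objectives.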
Published in: IEEE Transactions on Industrial Informatics (Volume: 15, Issue: 4, April 2019)
Page(s): 2395 - 2404
Date of Publication: 14 November 2018
