Abstract
Hierarchical reinforcement learning (HRL) is a promising approach that decomposes complex tasks into a series of sub-tasks. However, most current HRL methods converge slowly, which makes them difficult to apply widely to real-world scenarios. In this paper, we propose an efficient hierarchical reinforcement learning algorithm with fuzzy rules (HFR), a novel framework that integrates human prior knowledge into a hierarchical policy network and thereby effectively accelerates policy optimization. The proposed model uses fuzzy rules to represent human prior knowledge; because fuzzy rules are differentiable, the rules themselves become trainable. In addition, we propose a switch module that adaptively adjusts the decision-making frequency of the upper-level policy, removing the need for manual tuning. Experimental results demonstrate that HFR converges faster than current state-of-the-art HRL algorithms, especially in complex scenarios such as robot-control tasks.
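The abstract's key mechanism is that fuzzy rules are differentiable, so their parameters can be trained by gradient descent alongside the policy network. The following is a minimal illustrative sketch of that idea, not the paper's actual implementation: it assumes Gaussian membership functions with trainable centers and widths and a product t-norm for the rule firing strength, then verifies the analytic gradient with a finite-difference check.

```python
import numpy as np

def gaussian_membership(x, c, s):
    """Degree to which input x belongs to a fuzzy set with center c, width s."""
    return np.exp(-((x - c) ** 2) / (2 * s ** 2))

def rule_firing_strength(x, centers, widths):
    """Product t-norm over per-dimension memberships: the rule's activation."""
    return np.prod(gaussian_membership(x, centers, widths))

def grad_wrt_centers(x, centers, widths):
    """Analytic gradient of the firing strength w.r.t. the rule centers.
    This differentiability is what makes the rule parameters trainable."""
    w = rule_firing_strength(x, centers, widths)
    return w * (x - centers) / widths ** 2

# Toy 2-D state and one rule with (hypothetical) trainable parameters.
x = np.array([0.5, -0.2])
centers = np.array([0.0, 0.0])
widths = np.array([1.0, 1.0])

analytic = grad_wrt_centers(x, centers, widths)

# Finite-difference check confirms the analytic gradient.
eps = 1e-6
numeric = np.zeros_like(centers)
for i in range(len(centers)):
    cp, cm = centers.copy(), centers.copy()
    cp[i] += eps
    cm[i] -= eps
    numeric[i] = (rule_firing_strength(x, cp, widths)
                  - rule_firing_strength(x, cm, widths)) / (2 * eps)

assert np.allclose(analytic, numeric, atol=1e-6)
```

In a full system, the firing strengths of several such rules would weight rule-suggested actions, and the centers and widths would receive gradients from the policy loss; the exact rule form and integration with the hierarchical policy follow the paper, not this sketch.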
Acknowledgements
Wei Shi, Yanghe Feng and Honglan Huang contributed equally to this work and should be considered co-first authors. The authors would like to thank Xingxing Liang for his contribution to the model design of this paper.
Funding
The work described in this paper was funded by the National Natural Science Foundation of P.R. China (71701205).
Cite this article
Shi, W., Feng, Y., Huang, H. et al. Efficient hierarchical policy network with fuzzy rules. Int. J. Mach. Learn. & Cyber. 13, 447–459 (2022). https://doi.org/10.1007/s13042-021-01417-2