ORLEP: an efficient offline reinforcement learning evaluation platform

Mao, Keming; Chen, Chen; Zhang, Jinkai; Li, Yiyang

doi:10.1007/s11042-023-16906-5

1230: Sentient Multimedia Systems and Visual Intelligence
Published: 22 September 2023

Volume 83, pages 37073–37087, (2024)

Multimedia Tools and Applications

Keming Mao¹, Chen Chen¹ (ORCID: orcid.org/0000-0002-8783-3798), Jinkai Zhang¹ & Yiyang Li¹

Abstract

Developing applications for offline reinforcement learning evaluation poses several challenges, including integrating heterogeneous data and algorithms, providing a user-friendly interface, and managing resources flexibly. This paper designs and implements ORLEP, an efficient platform that provides high-level services for offline reinforcement learning evaluation. ORLEP integrates an underlying infrastructure with high concurrency and reliability, core components that support distributed deployment, and third-party libraries and benchmarks. On top of these, it supplies high-level abstractions for (1) data management, (2) model training and evaluation, (3) result visualization, and (4) resource configuration and supervision. The paper also verifies specific cases, and the results demonstrate the performance and scalability of ORLEP.

Data Availability

Data are openly available in a public repository.

Acknowledgements

All authors contributed to the study conception and design. Material preparation, analysis, and writing of the original draft were performed by Chen Chen. Resources, supervision, and funding acquisition were provided by Keming Mao. Material preparation, software, and investigation were performed by Jinkai Zhang. Data collection and testing were performed by Yiyang Li. All authors commented on previous versions of the manuscript, and all authors read and approved the final manuscript.

Funding

This work was supported by the Natural Science Foundation of Liaoning Province, China (No. 2022-MS-112).

Author information

Authors and Affiliations

Software College, Northeastern University, Shenyang, China
Keming Mao, Chen Chen, Jinkai Zhang & Yiyang Li

Corresponding author

Correspondence to Chen Chen.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Mao, K., Chen, C., Zhang, J. et al. ORLEP: an efficient offline reinforcement learning evaluation platform. Multimed Tools Appl 83, 37073–37087 (2024). https://doi.org/10.1007/s11042-023-16906-5

Received: 13 July 2022
Revised: 25 May 2023
Accepted: 27 August 2023
Published: 22 September 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s11042-023-16906-5
