Abstract
Modern system design and development often consist of combining components developed by separate vendors under known constraints that allow them to operate together. Such a system may benefit from further refinement once the components are integrated. We suggest a learning-open architecture that employs deep reinforcement learning under weak assumptions. The components are “black boxes”, whose internal structure is not known, and the learning is performed in a distributed way, where each process is aware only of its local execution information and of the global utility value of the system, calculated after complete executions. We employ proximal policy optimization (PPO) as our learning architecture, adapted to training control for black-box components. We first apply the PPO architecture to a simplified case, where a single component connected to a black-box environment must be trained; we show a stark improvement over a previous attempt. We then move on to study the case of multiple components.
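The single-component setting described above can be sketched as a minimal, purely illustrative PPO-clip loop. Everything here is hypothetical (the two-action interaction alphabet, the `black_box_utility` function, the fixed baseline of 0.5): the learner samples a complete execution trace, observes only the global utility revealed at the end, and updates a softmax policy with the PPO clipped-ratio rule.

```python
import math
import random

random.seed(0)

ACTIONS = [0, 1]   # abstract interaction choices offered to the component
EPS_CLIP = 0.2     # PPO clipping parameter
LR = 0.5           # learning rate for the softmax preferences

def black_box_utility(trace):
    """Hypothetical black box: the global utility is revealed only after a
    complete execution; its internal structure is unknown to the learner."""
    return sum(trace) / len(trace)

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def run_episode(prefs, length=5):
    """One complete execution: sample a trace of interactions."""
    probs = softmax(prefs)
    trace = [random.choices(ACTIONS, probs)[0] for _ in range(length)]
    return trace, probs

prefs = [0.0, 0.0]
for _ in range(300):
    trace, old_probs = run_episode(prefs)
    utility = black_box_utility(trace)   # the only feedback the learner sees
    advantage = utility - 0.5            # hypothetical fixed baseline
    for a in trace:
        new_probs = softmax(prefs)
        ratio = new_probs[a] / old_probs[a]
        # PPO clipped surrogate: no gradient once the ratio leaves the trust region
        if (advantage > 0 and ratio > 1 + EPS_CLIP) or \
           (advantage < 0 and ratio < 1 - EPS_CLIP):
            continue
        # gradient of log pi(a) with respect to the softmax preferences
        for i in ACTIONS:
            grad = (1.0 if i == a else 0.0) - new_probs[i]
            prefs[i] += LR * advantage * ratio * grad

print(round(softmax(prefs)[1], 2))  # probability assigned to action 1 after training
```

A real instantiation would replace the tabular softmax with a neural network and the toy utility with the actual composed system, but the key constraint is the same: the gradient signal is derived solely from end-of-execution global utility, never from the black box's internals.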
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 956123 - FOCETA.
Notes
- 1.
This can also be easily generalized to employ a utility function calculated from the local utility functions of each component; for example, each component can compute some measure of success (e.g., in the form of a discounted sum) over its interactions, and the global utility can be the sum of this measure over the participating processes.
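As a concrete instance of this aggregation, the following sketch (with hypothetical helper names and an arbitrary discount factor) computes each process's local measure as a discounted sum of per-interaction successes and takes the global utility to be the sum of the local measures:

```python
# Hypothetical local measure: a discounted sum of per-interaction successes,
# where successes[t] is 1 if the t-th interaction of the process succeeded.
def local_utility(successes, gamma=0.9):
    return sum(s * gamma ** t for t, s in enumerate(successes))

# Global utility: the sum of the local measures over the participating processes.
def global_utility(per_process_successes, gamma=0.9):
    return sum(local_utility(s, gamma) for s in per_process_successes)

# Two processes, three interactions each: 1 + 0.81 and 0.9 + 0.81.
print(round(global_utility([[1, 0, 1], [0, 1, 1]]), 2))  # → 3.52
```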
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cohen, I., Peled, D. (2024). Integrating Distributed Component-Based Systems Through Deep Reinforcement Learning. In: Steffen, B. (ed.) Bridging the Gap Between AI and Reality. AISoLA 2023. Lecture Notes in Computer Science, vol. 14380. Springer, Cham. https://doi.org/10.1007/978-3-031-46002-9_26
DOI: https://doi.org/10.1007/978-3-031-46002-9_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46001-2
Online ISBN: 978-3-031-46002-9