Integrating Distributed Component-Based Systems Through Deep Reinforcement Learning

  • Conference paper
Bridging the Gap Between AI and Reality (AISoLA 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14380)

Abstract

Modern system design and development often consist of combining components developed by separate vendors under known constraints that allow them to operate together. Such a system may benefit from additional refinement once the components are integrated. We suggest a learning-open architecture that employs deep reinforcement learning under weak assumptions. The components are “black boxes” whose internal structure is not known, and the learning is performed in a distributed way, where each process is aware only of its local execution information and of the global utility value of the system, calculated after complete executions. We employ proximal policy optimization (PPO) as our learning architecture, adapted to the case of training control for black-box components. We start by applying the PPO architecture to a simplified case, where we need to train a single component connected to a black-box environment; we show a stark improvement over a previous attempt. We then move on to study the case of multiple components.
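
The abstract describes training from a single scalar utility that is observed only after a complete execution. The following is a minimal sketch, not the authors' implementation, of the clipped surrogate objective that defines PPO (Schulman et al., 2017, arXiv:1707.06347), assuming a discrete action set and a PyTorch policy network; the names Policy and ppo_loss, the network sizes, and the use of one shared advantage value per episode are illustrative assumptions.

    import torch
    import torch.nn as nn

    class Policy(nn.Module):
        # Illustrative controller: maps an encoded local execution history
        # to logits over the component's available actions.
        def __init__(self, obs_dim, n_actions, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, n_actions),
            )

        def forward(self, obs):
            return torch.distributions.Categorical(logits=self.net(obs))

    def ppo_loss(policy, obs, actions, old_log_probs, advantages, clip_eps=0.2):
        # PPO clipped surrogate objective: limit how far the updated policy
        # moves from the policy that collected the data.
        dist = policy(obs)
        ratio = torch.exp(dist.log_prob(actions) - old_log_probs)  # pi_new / pi_old
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
        return -torch.min(unclipped, clipped).mean()  # minimize the negative objective

    # Toy usage: because the global utility is observed only after a complete
    # execution, every step of an episode can share the same (centered) return.
    policy = Policy(obs_dim=4, n_actions=3)
    obs = torch.randn(8, 4)
    actions = torch.randint(0, 3, (8,))
    with torch.no_grad():
        old_log_probs = policy(obs).log_prob(actions)
    advantages = torch.randn(8)  # stand-in for centered end-of-execution utilities
    ppo_loss(policy, obs, actions, old_log_probs, advantages).backward()

The clipping step is what makes PPO attractive in this black-box setting: updates stay conservative even when the only learning signal is a coarse, episode-level utility.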

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 956123 - FOCETA.


Notes

  1. This can also be easily generalized to employ a utility function that is calculated from the local utility functions of each component; for example, each component can calculate some measure of success (e.g., in the form of a discounted sum) of performing interactions, and the global utility can be the sum of this measure over the participating processes (illustrated in the sketch below).
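
As a concrete illustration of this generalization, the hypothetical sketch below computes a discounted local measure per component and sums these into a global utility; the function names and the discount factor are assumptions for illustration, not part of the paper.

    def local_utility(rewards, gamma=0.99):
        # Discounted sum of one component's per-interaction success measures.
        total, discount = 0.0, 1.0
        for r in rewards:
            total += discount * r
            discount *= gamma
        return total

    def global_utility(per_component_rewards):
        # Global utility: sum of the local measures over the participating processes.
        return sum(local_utility(rs) for rs in per_component_rewards)

    # Two components' interaction outcomes from one complete execution.
    print(global_utility([[1.0, 0.0, 1.0], [0.5, 0.5]]))  # about 2.975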


Author information


Correspondence to Doron Peled.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cohen, I., Peled, D. (2024). Integrating Distributed Component-Based Systems Through Deep Reinforcement Learning. In: Steffen, B. (ed.) Bridging the Gap Between AI and Reality. AISoLA 2023. Lecture Notes in Computer Science, vol 14380. Springer, Cham. https://doi.org/10.1007/978-3-031-46002-9_26

  • DOI: https://doi.org/10.1007/978-3-031-46002-9_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46001-2

  • Online ISBN: 978-3-031-46002-9

  • eBook Packages: Computer Science, Computer Science (R0)
