
Synthesizing Control for a System with Black Box Environment, Based on Deep Learning

  • Conference paper
In: Leveraging Applications of Formal Methods, Verification and Validation: Engineering Principles (ISoLA 2020)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12477)

Abstract

We study the synthesis of control for a system that interacts with a black-box environment, based on deep learning. The goal is to minimize the number of interaction failures. The current state of the environment is unavailable to the controller, hence its operation depends on a limited view of the history. We suggest a reinforcement learning framework for training a Recurrent Neural Network (RNN) to control such a system. We experiment with various parameters: the loss function, the exploration/exploitation ratio, and the size of the lookahead. We design examples that capture various potential control difficulties. We present experiments performed with the toolkit DyNet.
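
To make the setup concrete, the following is a minimal sketch, not the authors' implementation, of how an RNN controller for a black-box environment might be trained in DyNet: an LSTM summarizes the visible interaction history, an epsilon-greedy choice trades off exploration and exploitation, and interaction failures are penalized through the loss. The environment interface (env.reset, env.step), the dimensions, the epsilon value, and the particular loss are illustrative assumptions; the paper's various loss functions and lookahead sizes are not modeled here.

    # Illustrative sketch only; the environment API (env.reset/env.step), the sizes
    # and the loss are assumptions, not the configuration used in the paper.
    import random
    import dynet as dy

    OBS_DIM, HIDDEN_DIM, NUM_ACTIONS = 8, 64, 4
    EPSILON = 0.1                                      # exploration/exploitation ratio

    pc = dy.ParameterCollection()
    lstm = dy.LSTMBuilder(1, OBS_DIM, HIDDEN_DIM, pc)  # summarizes the visible history
    W = pc.add_parameters((NUM_ACTIONS, HIDDEN_DIM))   # maps the LSTM state to action scores
    b = pc.add_parameters(NUM_ACTIONS)
    trainer = dy.SimpleSGDTrainer(pc)

    def run_episode(env, steps):
        dy.renew_cg()
        state = lstm.initial_state()
        losses, failures = [], 0
        obs = env.reset()                              # hypothetical black-box environment API
        for _ in range(steps):
            state = state.add_input(dy.inputVector(obs))
            probs = dy.softmax(W * state.output() + b)
            if random.random() < EPSILON:              # explore
                action = random.randrange(NUM_ACTIONS)
            else:                                      # exploit
                pvals = probs.value()
                action = max(range(NUM_ACTIONS), key=lambda a: pvals[a])
            obs, failed = env.step(action)
            if failed:                                 # one simple choice of loss:
                failures += 1                          # lower the probability of the
                losses.append(dy.pick(probs, action))  # action that led to a failure
        if losses:
            loss = dy.esum(losses)
            loss.value()                               # forward pass
            loss.backward()                            # backward pass
            trainer.update()                           # stochastic gradient step
        return failures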

S. Iosti and S. Bensalem: The research performed by these authors was partially funded by the H2020-ECSEL grant CPS4EU (2018-IA call, Grant Agreement number 826276).

D. Peled and K. Aharon: The research performed by these authors was partially funded by the ISF grants “Runtime Measuring and Checking of Cyber Physical Systems” (ISF award 2239/15) and “Efficient Runtime Verification for Systems with Lots of Data and its Applications” (ISF award 1464/18).

Notes

  1. DyNet is a Python package for automatic differentiation and stochastic gradient training, similar to PyTorch and TensorFlow, but also optimized for strong CPU performance; a toy illustration is sketched below.
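
As background for this note, the following toy example, not taken from the paper, shows the DyNet idiom of building a computation graph dynamically, differentiating it automatically, and applying a stochastic gradient update; the data and the single learned weight are made up for illustration.

    # Toy illustration of DyNet's automatic differentiation and SGD training;
    # the data and the single weight are invented for this example.
    import dynet as dy

    pc = dy.ParameterCollection()
    w = pc.add_parameters(1)                   # one scalar weight to learn
    trainer = dy.SimpleSGDTrainer(pc)

    for x, y in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)] * 50:
        dy.renew_cg()                          # a fresh computation graph per example
        pred = w * dy.scalarInput(x)           # the graph is built dynamically
        loss = dy.squared_distance(pred, dy.scalarInput(y))
        loss.value()                           # forward pass
        loss.backward()                        # gradients by automatic differentiation
        trainer.update()                       # stochastic gradient step

    print(w.as_array())                        # approaches 2.0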

Author information

Corresponding authors

Correspondence to Simon Iosti, Doron Peled, Khen Aharon, Saddek Bensalem or Yoav Goldberg.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Iosti, S., Peled, D., Aharon, K., Bensalem, S., Goldberg, Y. (2020). Synthesizing Control for a System with Black Box Environment, Based on Deep Learning. In: Margaria, T., Steffen, B. (eds) Leveraging Applications of Formal Methods, Verification and Validation: Engineering Principles. ISoLA 2020. Lecture Notes in Computer Science, vol 12477. Springer, Cham. https://doi.org/10.1007/978-3-030-61470-6_27

  • DOI: https://doi.org/10.1007/978-3-030-61470-6_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61469-0

  • Online ISBN: 978-3-030-61470-6

  • eBook Packages: Computer Science, Computer Science (R0)
