A Survey of Model Learning Techniques for Recurrent Neural Networks

Chapter in: A Journey from Process Algebra via Timed Automata to Model Learning

Abstract

Ensuring the correctness and reliability of deep neural networks is a challenge. Suitable formal analysis and verification techniques have yet to be developed. One promising approach towards this goal is model learning, which seeks to derive surrogate models of the underlying neural network in a model class that permits sophisticated analysis and verification techniques. This paper surveys several existing model learning approaches that infer finite-state automata and context-free grammars from Recurrent Neural Networks, an essential class of deep neural networks for sequential data. Most of these methods rely on Angluin’s approach for learning finite automata but implement different ways of checking the equivalence of a learned model with the neural network. Our paper presents these distinct techniques in a unified language and discusses their strengths and weaknesses. Furthermore, we survey model learning techniques that follow a novel trend in explainable artificial intelligence and learn models in the form of formal grammars.

This work was partly supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) grant number 434592664.

Notes

  1.

    We refer to the comprehensive overview article by Frits Vaandrager [30] for a gentle introduction to the field of automata-based model learning; it highlights the milestones leading up to the current state of the art and identifies the challenges the field faces today.

  2.

    Various improvements have been proposed over the years, such as Rivest and Schapire's algorithm [28] and Kearns and Vazirani's algorithm [17]. However, all of these operate within Angluin's minimally adequate teacher framework [3] and can be swapped in seamlessly if desired. Hence, for the sake of a more straightforward exposition, this paper focuses on Angluin's L\(^*\) algorithm as a prototypical example. The reader may substitute L\(^*\) with any other learning algorithm that operates in the minimally adequate teacher framework.
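
    To make the minimally adequate teacher framework concrete, the following Python sketch spells out the two kinds of queries such a teacher must answer; the class and method names are illustrative and not taken from any particular library.

```python
from abc import ABC, abstractmethod
from typing import Optional, Tuple

class MinimallyAdequateTeacher(ABC):
    """Oracle interface assumed by L* and its variants (illustrative sketch)."""

    @abstractmethod
    def membership_query(self, word: Tuple[str, ...]) -> bool:
        """Return True iff the target language contains `word`."""

    @abstractmethod
    def equivalence_query(self, hypothesis) -> Optional[Tuple[str, ...]]:
        """Return None if `hypothesis` accepts exactly the target language,
        otherwise a counterexample word on which the two disagree."""
```

    When the system under learning is an RNN acceptor, a membership query simply runs the network on the given word, whereas the equivalence query admits no exact implementation and must be approximated; the approaches surveyed in this paper differ precisely in how they approximate it.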

  3.

    For the sake of convenience, Barbot et al. [5] restrict their focus to well-formed words as this entails a bijection between push and pop positions.
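
    For intuition, a word over a visibly pushdown alphabet [2] is well-formed (well-matched) when every push symbol is matched by a later pop symbol and vice versa. The following Python sketch, with illustrative names, checks this condition with a simple counter:

```python
def is_well_formed(word, push_symbols, pop_symbols):
    """Check that every push has a matching later pop and vice versa.

    Symbols that are neither pushes nor pops (internal symbols) are
    ignored. Illustrative sketch of the standard well-matchedness
    condition for nested words.
    """
    depth = 0
    for symbol in word:
        if symbol in push_symbols:
            depth += 1
        elif symbol in pop_symbols:
            if depth == 0:        # pop with no pending push
                return False
            depth -= 1
    return depth == 0             # no unmatched pushes remain

# Illustrative usage: '(' pushes, ')' pops, 'a' is internal.
assert is_well_formed("(()a)", {"("}, {")"})
assert not is_well_formed("())", {"("}, {")"})
```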

  4.

    Here, we assume that the acceptance function \(g : \mathbb{R}^\ell \rightarrow \{0, 1\}\) of the given RNN is obtained as the composition of a function \(\mathit{score} : \mathbb{R}^\ell \rightarrow \mathbb{R}\) and a threshold function \(\mathit{thr} : \mathbb{R} \rightarrow \{0, 1\}\) such that, for a given threshold \(\tau \in \mathbb{R}\), we have \(\mathit{thr}(x) = 1\) iff \(x \ge \tau\).
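
    As a minimal illustration of this decomposition, the Python sketch below builds \(g\) as the composition of a threshold with a hypothetical \(\mathit{score}\) function; the linear score in the usage example is illustrative only.

```python
import numpy as np

def make_acceptance(score, tau):
    """Compose a score function with a threshold: g = thr ∘ score.

    `score` maps a state vector in R^ell to a real number; the
    returned function g maps it to 1 iff the score is at least tau.
    """
    def g(state: np.ndarray) -> int:
        return 1 if score(state) >= tau else 0
    return g

# Illustrative usage with a hypothetical linear score and tau = 0.0.
w = np.array([0.5, -1.0, 2.0])
g = make_acceptance(lambda h: float(w @ h), tau=0.0)
print(g(np.array([1.0, 0.0, 0.0])))  # prints 1, since 0.5 >= 0.0
```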

References

  1. Alur, R., Kumar, V., Madhusudan, P., Viswanathan, M.: Congruences for visibly pushdown languages. In: Caires, L., Italiano, G.F., Monteiro, L., Palamidessi, C., Yung, M. (eds.) ICALP 2005. LNCS, vol. 3580, pp. 1102–1114. Springer, Heidelberg (2005). https://doi.org/10.1007/11523468_89

  2. Alur, R., Madhusudan, P.: Adding nesting structure to words. J. ACM 56(3), 16:1–16:43 (2009). https://doi.org/10.1145/1516512.1516518

  3. Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987). https://doi.org/10.1016/0890-5401(87)90052-6

  4. Ayache, S., Eyraud, R., Goudian, N.: Explaining black boxes on sequential data using weighted automata. In: Unold, O., Dyrka, W., Wieczorek, W. (eds.) Proceedings of the 14th International Conference on Grammatical Inference, ICGI 2018, Wrocław, Poland, 5–7 September 2018. Proceedings of Machine Learning Research, vol. 93, pp. 81–103. PMLR (2018). http://proceedings.mlr.press/v93/ayache19a.html

  5. Barbot, B., Bollig, B., Finkel, A., Haddad, S., Khmelnitsky, I., Leucker, M., Neider, D., Roy, R., Ye, L.: Extracting context-free grammars from recurrent neural networks using tree-automata learning and A* search. In: Chandlee, J., Eyraud, R., Heinz, J., Jardine, A., van Zaanen, M. (eds.) Proceedings of the Fifteenth International Conference on Grammatical Inference. Proceedings of Machine Learning Research, vol. 153, pp. 113–129. PMLR, 23–27 August 2021. https://proceedings.mlr.press/v153/barbot21a.html

  6. Boser, B.E., Guyon, I., Vapnik, V.: A training algorithm for optimal margin classifiers. In: Haussler, D. (ed.) Proceedings of the Fifth Annual ACM Conference on Computational Learning Theory, COLT 1992, Pittsburgh, PA, USA, 27–29 July 1992, pp. 144–152. ACM (1992). https://doi.org/10.1145/130385.130401

  7. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Moschitti, A., Pang, B., Daelemans, W. (eds.) Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, 25–29 October 2014, Doha, Qatar, A Meeting of SIGDAT, a Special Interest Group of the ACL, pp. 1724–1734. ACL (2014). https://doi.org/10.3115/v1/d14-1179

  8. Drewes, F., Högberg, J.: Query learning of regular tree languages: How to avoid dead states. Theory Comput. Syst. 40(2), 163–185 (2007). https://doi.org/10.1007/s00224-005-1233-3

  9. Eisner, C., Fisman, D.: A Practical Introduction to PSL. Series on Integrated Circuits and Systems. Springer, Heidelberg (2006). https://doi.org/10.1007/978-0-387-36123-9

  10. Garavel, H., ter Beek, M.H., van de Pol, J.: The 2020 expert survey on formal methods. In: ter Beek, M.H., Ničković, D. (eds.) FMICS 2020. LNCS, vol. 12327, pp. 3–69. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58298-2_1

  11. Hart, P.E., Nilsson, N.J., Raphael, B.: A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Syst. Sci. Cybern. 4(2), 100–107 (1968). https://doi.org/10.1109/TSSC.1968.300136

  12. de la Higuera, C.: A bibliographical study of grammatical inference. Pattern Recognit. 38(9), 1332–1348 (2005). https://doi.org/10.1016/j.patcog.2005.01.003

  13. de la Higuera, C.: Grammatical Inference. Cambridge University Press, Cambridge (2010). https://doi.org/10.1017/CBO9781139194655

  14. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997). https://doi.org/10.1162/neco.1997.9.8.1735

  15. Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963). https://doi.org/10.2307/2282952

  16. Jacobsson, H.: Rule extraction from recurrent neural networks: A taxonomy and review. Neural Comput. 17(6), 1223–1263 (2005). https://doi.org/10.1162/0899766053630350

  17. Kearns, M.J., Vazirani, U.V.: An Introduction to Computational Learning Theory. MIT Press, Cambridge (1994). https://doi.org/10.7551/mitpress/3897.001.0001

  18. Khmelnitsky, I., et al.: Property-directed verification and robustness certification of recurrent neural networks. In: Hou, Z., Ganesh, V. (eds.) ATVA 2021. LNCS, vol. 12971, pp. 364–380. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-88885-5_24

  19. Legay, A., Lukina, A., Traonouez, L.M., Yang, J., Smolka, S.A., Grosu, R.: Statistical model checking. In: Steffen, B., Woeginger, G. (eds.) Computing and Software Science. LNCS, vol. 10000, pp. 478–504. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-91908-9_23

  20. Leucker, M.: Formal verification of neural networks? In: Carvalho, G., Stolz, V. (eds.) SBMF 2020. LNCS, vol. 12475, pp. 3–7. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-63882-5_1

  21. Liu, B.: Sentiment analysis and subjectivity. In: Indurkhya, N., Damerau, F.J. (eds.) Handbook of Natural Language Processing, 2nd edn., pp. 627–666. Chapman and Hall/CRC (2010). http://www.crcnetbase.com/doi/abs/10.1201/9781420085938-c26

  22. Mayr, F., Visca, R., Yovine, S.: On-the-fly black-box probably approximately correct checking of recurrent neural networks. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2020. LNCS, vol. 12279, pp. 343–363. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57321-8_19

  23. Mayr, F., Yovine, S.: Regular inference on artificial neural networks. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2018. LNCS, vol. 11015, pp. 350–369. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99740-7_25

  24. Mayr, F., Yovine, S., Visca, R.: Property checking with interpretable error characterization for recurrent neural networks. Mach. Learn. Knowl. Extr. 3(1), 205–227 (2021). https://doi.org/10.3390/make3010010

  25. Okudono, T., Waga, M., Sekiyama, T., Hasuo, I.: Weighted automata extraction from recurrent neural networks via regression on state spaces. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, 7–12 February 2020, pp. 5306–5314. AAAI Press (2020). https://ojs.aaai.org/index.php/AAAI/article/view/5977

  26. Omlin, C.W., Giles, C.L.: Extraction of rules from discrete-time recurrent neural networks. Neural Netw. 9(1), 41–52 (1996). https://doi.org/10.1016/0893-6080(95)00086-0

  27. Pnueli, A.: The temporal logic of programs. In: 18th Annual Symposium on Foundations of Computer Science, Providence, Rhode Island, USA, 31 October–1 November 1977, pp. 46–57. IEEE Computer Society (1977). https://doi.org/10.1109/SFCS.1977.32

  28. Rivest, R.L., Schapire, R.E.: Inference of finite automata using homing sequences. Inf. Comput. 103(2), 299–347 (1993). https://doi.org/10.1006/inco.1993.1021

  29. Seshia, S.A., et al.: Formal specification for deep neural networks. In: Lahiri, S.K., Wang, C. (eds.) ATVA 2018. LNCS, vol. 11138, pp. 20–34. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01090-4_2

  30. Vaandrager, F.W.: Model learning. Commun. ACM 60(2), 86–95 (2017). https://doi.org/10.1145/2967606

  31. Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984). https://doi.org/10.1145/1968.1972

  32. Weiss, G., Goldberg, Y., Yahav, E.: Extracting automata from recurrent neural networks using queries and counterexamples. In: Dy, J.G., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, 10–15 July 2018. Proceedings of Machine Learning Research, vol. 80, pp. 5244–5253. PMLR (2018). http://proceedings.mlr.press/v80/weiss18a.html

  33. Weiss, G., Goldberg, Y., Yahav, E.: On the practical computational power of finite precision RNNs for language recognition. In: Gurevych, I., Miyao, Y. (eds.) Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, 15–20 July 2018, Volume 2: Short Papers, pp. 740–745. Association for Computational Linguistics (2018). https://doi.org/10.18653/v1/P18-2117

  34. Weiss, G., Goldberg, Y., Yahav, E.: Learning deterministic weighted automata with queries and counterexamples. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8–14 December 2019, Vancouver, BC, Canada, pp. 8558–8569 (2019). https://proceedings.neurips.cc/paper/2019/hash/d3f93e7766e8e1b7ef66dfdd9a8be93b-Abstract.html

  35. Xie, X., Kersting, K., Neider, D.: Neuro-symbolic verification of deep neural networks. In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022. ijcai.org (2022, to appear). https://doi.org/10.48550/arXiv.2203.00938

  36. Yellin, D.M., Weiss, G.: Synthesizing context-free grammars from recurrent neural networks. In: Groote, J.F., Larsen, K.G. (eds.) TACAS 2021. LNCS, vol. 12651, pp. 351–369. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72016-2_19

Author information

Correspondence to Daniel Neider.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Bollig, B., Leucker, M., Neider, D. (2022). A Survey of Model Learning Techniques for Recurrent Neural Networks. In: Jansen, N., Stoelinga, M., van den Bos, P. (eds.) A Journey from Process Algebra via Timed Automata to Model Learning. Lecture Notes in Computer Science, vol. 13560. Springer, Cham. https://doi.org/10.1007/978-3-031-15629-8_5

  • DOI: https://doi.org/10.1007/978-3-031-15629-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15628-1

  • Online ISBN: 978-3-031-15629-8

  • eBook Packages: Computer Science, Computer Science (R0)
