A Survey of Model Learning Techniques for Recurrent Neural Networks

Chapter in: A Journey from Process Algebra via Timed Automata to Model Learning

Abstract

Ensuring the correctness and reliability of deep neural networks is a challenge. Suitable formal analysis and verification techniques have yet to be developed. One promising approach towards this goal is model learning, which seeks to derive surrogate models of the underlying neural network in a model class that permits sophisticated analysis and verification techniques. This paper surveys several existing model learning approaches that infer finite-state automata and context-free grammars from Recurrent Neural Networks, an essential class of deep neural networks for sequential data. Most of these methods rely on Angluin’s approach for learning finite automata but implement different ways of checking the equivalence of a learned model with the neural network. Our paper presents these distinct techniques in a unified language and discusses their strengths and weaknesses. Furthermore, we survey model learning techniques that follow a novel trend in explainable artificial intelligence and learn models in the form of formal grammars.

This work was partly supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) grant number 434592664.

Notes

  1.

    We refer to the comprehensive overview article by Frits Vaandrager [30] for a gentle introduction to the field of automata-based model learning; it highlights the milestones leading up to the current state of the art and identifies the challenges the field faces today.

  2.

    Various improvements have been proposed over the years, such as Rivest and Schapire's algorithm [28] and Kearns and Vazirani's algorithm [17]. However, all of these operate within Angluin's minimally adequate teacher framework [3] and can be swapped in seamlessly if desired. Hence, for the sake of a more straightforward exposition, this paper focuses on Angluin's L\(^*\) algorithm as a prototypical example. The reader may substitute L\(^*\) with any other learning algorithm that operates in the minimally adequate teacher framework.
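
    To make the minimally adequate teacher framework concrete, the following Python sketch spells out the two kinds of queries such a teacher must answer; the class and method names are illustrative and not taken from any particular library.

```python
from abc import ABC, abstractmethod
from typing import Optional, Tuple

class MinimallyAdequateTeacher(ABC):
    """Oracle interface assumed by L* and its variants (illustrative sketch)."""

    @abstractmethod
    def membership_query(self, word: Tuple[str, ...]) -> bool:
        """Return True iff the target language contains `word`."""

    @abstractmethod
    def equivalence_query(self, hypothesis) -> Optional[Tuple[str, ...]]:
        """Return None if `hypothesis` accepts exactly the target language,
        otherwise a counterexample word on which the two disagree."""
```

    When the system under learning is an RNN acceptor, a membership query simply runs the network on the given word, whereas the equivalence query admits no exact implementation and must be approximated; the approaches surveyed in this paper differ precisely in how they approximate it.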

  3.

    For the sake of convenience, Barbot et al. [5] restrict their focus to well-formed words as this entails a bijection between push and pop positions.
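
    For intuition, a word over a visibly pushdown alphabet [2] is well-formed (well-matched) when every push symbol is matched by a later pop symbol and vice versa. The following Python sketch, with illustrative names, checks this condition with a simple counter:

```python
def is_well_formed(word, push_symbols, pop_symbols):
    """Check that every push has a matching later pop and vice versa.

    Symbols that are neither pushes nor pops (internal symbols) are
    ignored. Illustrative sketch of the standard well-matchedness
    condition for nested words.
    """
    depth = 0
    for symbol in word:
        if symbol in push_symbols:
            depth += 1
        elif symbol in pop_symbols:
            if depth == 0:        # pop with no pending push
                return False
            depth -= 1
    return depth == 0             # no unmatched pushes remain

# Illustrative usage: '(' pushes, ')' pops, 'a' is internal.
assert is_well_formed("(()a)", {"("}, {")"})
assert not is_well_formed("())", {"("}, {")"})
```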

  4.

    Here, we assume that the acceptance function \(g : \mathbb{R}^\ell \rightarrow \{0, 1\}\) of the given RNN is obtained as the composition of a function \(\mathit{score} : \mathbb{R}^\ell \rightarrow \mathbb{R}\) and a threshold function \(\mathit{thr} : \mathbb{R} \rightarrow \{0, 1\}\) such that, for a given threshold \(\tau \in \mathbb{R}\), we have \(\mathit{thr}(x) = 1\) iff \(x \ge \tau\).
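
    As a minimal illustration of this decomposition, the Python sketch below builds \(g\) as the composition of a threshold with a hypothetical \(\mathit{score}\) function; the linear score in the usage example is illustrative only.

```python
import numpy as np

def make_acceptance(score, tau):
    """Compose a score function with a threshold: g = thr ∘ score.

    `score` maps a state vector in R^ell to a real number; the
    returned function g maps it to 1 iff the score is at least tau.
    """
    def g(state: np.ndarray) -> int:
        return 1 if score(state) >= tau else 0
    return g

# Illustrative usage with a hypothetical linear score and tau = 0.0.
w = np.array([0.5, -1.0, 2.0])
g = make_acceptance(lambda h: float(w @ h), tau=0.0)
print(g(np.array([1.0, 0.0, 0.0])))  # prints 1, since 0.5 >= 0.0
```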

References

  1. Alur, R., Kumar, V., Madhusudan, P., Viswanathan, M.: Congruences for visibly pushdown languages. In: Caires, L., Italiano, G.F., Monteiro, L., Palamidessi, C., Yung, M. (eds.) ICALP 2005. LNCS, vol. 3580, pp. 1102–1114. Springer, Heidelberg (2005). https://doi.org/10.1007/11523468_89

  2. Alur, R., Madhusudan, P.: Adding nesting structure to words. J. ACM 56(3), 16:1–16:43 (2009). https://doi.org/10.1145/1516512.1516518

  3. Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987). https://doi.org/10.1016/0890-5401(87)90052-6

  4. Ayache, S., Eyraud, R., Goudian, N.: Explaining black boxes on sequential data using weighted automata. In: Unold, O., Dyrka, W., Wieczorek, W. (eds.) Proceedings of the 14th International Conference on Grammatical Inference, ICGI 2018, Wrocław, Poland, 5–7 September 2018. Proceedings of Machine Learning Research, vol. 93, pp. 81–103. PMLR (2018). http://proceedings.mlr.press/v93/ayache19a.html

  5. Barbot, B., Bollig, B., Finkel, A., Haddad, S., Khmelnitsky, I., Leucker, M., Neider, D., Roy, R., Ye, L.: Extracting context-free grammars from recurrent neural networks using tree-automata learning and A* search. In: Chandlee, J., Eyraud, R., Heinz, J., Jardine, A., van Zaanen, M. (eds.) Proceedings of the Fifteenth International Conference on Grammatical Inference. Proceedings of Machine Learning Research, vol. 153, pp. 113–129. PMLR, 23–27 August 2021. https://proceedings.mlr.press/v153/barbot21a.html

  6. Boser, B.E., Guyon, I., Vapnik, V.: A training algorithm for optimal margin classifiers. In: Haussler, D. (ed.) Proceedings of the Fifth Annual ACM Conference on Computational Learning Theory, COLT 1992, Pittsburgh, PA, USA, 27–29 July 1992, pp. 144–152. ACM (1992). https://doi.org/10.1145/130385.130401

  7. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Moschitti, A., Pang, B., Daelemans, W. (eds.) Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, 25–29 October 2014, Doha, Qatar, A Meeting of SIGDAT, a Special Interest Group of the ACL, pp. 1724–1734. ACL (2014). https://doi.org/10.3115/v1/d14-1179

  8. Drewes, F., Högberg, J.: Query learning of regular tree languages: How to avoid dead states. Theory Comput. Syst. 40(2), 163–185 (2007). https://doi.org/10.1007/s00224-005-1233-3

  9. Eisner, C., Fisman, D.: A Practical Introduction to PSL. Series on Integrated Circuits and Systems. Springer, Heidelberg (2006). https://doi.org/10.1007/978-0-387-36123-9

  10. Garavel, H., ter Beek, M.H., van de Pol, J.: The 2020 expert survey on formal methods. In: ter Beek, M.H., Ničković, D. (eds.) FMICS 2020. LNCS, vol. 12327, pp. 3–69. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58298-2_1

  11. Hart, P.E., Nilsson, N.J., Raphael, B.: A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Syst. Sci. Cybern. 4(2), 100–107 (1968). https://doi.org/10.1109/TSSC.1968.300136

  12. de la Higuera, C.: A bibliographical study of grammatical inference. Pattern Recognit. 38(9), 1332–1348 (2005). https://doi.org/10.1016/j.patcog.2005.01.003

  13. de la Higuera, C.: Grammatical Inference. Cambridge University Press, Cambridge (2010). https://doi.org/10.1017/CBO9781139194655

  14. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997). https://doi.org/10.1162/neco.1997.9.8.1735

  15. Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963). https://doi.org/10.2307/2282952

  16. Jacobsson, H.: Rule extraction from recurrent neural networks: A taxonomy and review. Neural Comput. 17(6), 1223–1263 (2005). https://doi.org/10.1162/0899766053630350

  17. Kearns, M.J., Vazirani, U.V.: An Introduction to Computational Learning Theory. MIT Press, Cambridge (1994). https://doi.org/10.7551/mitpress/3897.001.0001

  18. Khmelnitsky, I., et al.: Property-directed verification and robustness certification of recurrent neural networks. In: Hou, Z., Ganesh, V. (eds.) ATVA 2021. LNCS, vol. 12971, pp. 364–380. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-88885-5_24

  19. Legay, A., Lukina, A., Traonouez, L.M., Yang, J., Smolka, S.A., Grosu, R.: Statistical model checking. In: Steffen, B., Woeginger, G. (eds.) Computing and Software Science. LNCS, vol. 10000, pp. 478–504. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-91908-9_23

  20. Leucker, M.: Formal verification of neural networks? In: Carvalho, G., Stolz, V. (eds.) SBMF 2020. LNCS, vol. 12475, pp. 3–7. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-63882-5_1

  21. Liu, B.: Sentiment analysis and subjectivity. In: Indurkhya, N., Damerau, F.J. (eds.) Handbook of Natural Language Processing, 2nd edn., pp. 627–666. Chapman and Hall/CRC (2010). http://www.crcnetbase.com/doi/abs/10.1201/9781420085938-c26

  22. Mayr, F., Visca, R., Yovine, S.: On-the-fly black-box probably approximately correct checking of recurrent neural networks. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2020. LNCS, vol. 12279, pp. 343–363. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57321-8_19

  23. Mayr, F., Yovine, S.: Regular inference on artificial neural networks. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2018. LNCS, vol. 11015, pp. 350–369. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99740-7_25

  24. Mayr, F., Yovine, S., Visca, R.: Property checking with interpretable error characterization for recurrent neural networks. Mach. Learn. Knowl. Extr. 3(1), 205–227 (2021). https://doi.org/10.3390/make3010010

  25. Okudono, T., Waga, M., Sekiyama, T., Hasuo, I.: Weighted automata extraction from recurrent neural networks via regression on state spaces. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, 7–12 February 2020, pp. 5306–5314. AAAI Press (2020). https://ojs.aaai.org/index.php/AAAI/article/view/5977

  26. Omlin, C.W., Giles, C.L.: Extraction of rules from discrete-time recurrent neural networks. Neural Netw. 9(1), 41–52 (1996). https://doi.org/10.1016/0893-6080(95)00086-0

  27. Pnueli, A.: The temporal logic of programs. In: 18th Annual Symposium on Foundations of Computer Science, Providence, Rhode Island, USA, 31 October–1 November 1977, pp. 46–57. IEEE Computer Society (1977). https://doi.org/10.1109/SFCS.1977.32

  28. Rivest, R.L., Schapire, R.E.: Inference of finite automata using homing sequences. Inf. Comput. 103(2), 299–347 (1993). https://doi.org/10.1006/inco.1993.1021

  29. Seshia, S.A., et al.: Formal specification for deep neural networks. In: Lahiri, S.K., Wang, C. (eds.) ATVA 2018. LNCS, vol. 11138, pp. 20–34. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01090-4_2

  30. Vaandrager, F.W.: Model learning. Commun. ACM 60(2), 86–95 (2017). https://doi.org/10.1145/2967606

  31. Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984). https://doi.org/10.1145/1968.1972

  32. Weiss, G., Goldberg, Y., Yahav, E.: Extracting automata from recurrent neural networks using queries and counterexamples. In: Dy, J.G., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, 10–15 July 2018. Proceedings of Machine Learning Research, vol. 80, pp. 5244–5253. PMLR (2018). http://proceedings.mlr.press/v80/weiss18a.html

  33. Weiss, G., Goldberg, Y., Yahav, E.: On the practical computational power of finite precision RNNs for language recognition. In: Gurevych, I., Miyao, Y. (eds.) Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, 15–20 July 2018, Volume 2: Short Papers, pp. 740–745. Association for Computational Linguistics (2018). https://doi.org/10.18653/v1/P18-2117

  34. Weiss, G., Goldberg, Y., Yahav, E.: Learning deterministic weighted automata with queries and counterexamples. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8–14 December 2019, Vancouver, BC, Canada, pp. 8558–8569 (2019). https://proceedings.neurips.cc/paper/2019/hash/d3f93e7766e8e1b7ef66dfdd9a8be93b-Abstract.html

  35. Xie, X., Kersting, K., Neider, D.: Neuro-symbolic verification of deep neural networks. In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022. ijcai.org (2022, to appear). https://doi.org/10.48550/arXiv.2203.00938

  36. Yellin, D.M., Weiss, G.: Synthesizing context-free grammars from recurrent neural networks. In: Groote, J.F., Larsen, K.G. (eds.) TACAS 2021. LNCS, vol. 12651, pp. 351–369. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72016-2_19

Author information

Correspondence to Daniel Neider.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Bollig, B., Leucker, M., Neider, D. (2022). A Survey of Model Learning Techniques for Recurrent Neural Networks. In: Jansen, N., Stoelinga, M., van den Bos, P. (eds.) A Journey from Process Algebra via Timed Automata to Model Learning. Lecture Notes in Computer Science, vol. 13560. Springer, Cham. https://doi.org/10.1007/978-3-031-15629-8_5

  • DOI: https://doi.org/10.1007/978-3-031-15629-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15628-1

  • Online ISBN: 978-3-031-15629-8

  • eBook Packages: Computer Science, Computer Science (R0)
