Abstract
Machine Learning (ML) is the discipline that studies methods for automatically inferring models from data. Machine learning has been successfully applied in many areas of software engineering, including behaviour extraction, testing, and bug fixing, and many more applications are yet to be defined. Therefore, a better fundamental understanding of ML methods, their assumptions, and their guarantees can help to identify and adopt appropriate ML technology for new applications.
In this chapter, we present an introductory survey of ML applications in software engineering, classified in terms of the models they produce and the learning methods they use. We argue that the optimal choice of an ML method for a particular application should be guided by the type of models one seeks to infer. We describe some important principles of ML, give an overview of some key methods, and present examples of areas of software engineering benefiting from ML. We also discuss the open challenges for reaching the full potential of ML for software engineering and how ML can benefit from software engineering methods.
Notes
- 1.
Informally, identification in the limit means that for any infinite sequence of observations \(o_1, o_2 , \ldots \) there exists some finite point n after which the function learned from the first n observations is no longer changed by any later observations.
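As a toy illustration (not from the chapter, identifiers ours): a learner that hypothesises exactly the positively labelled words seen so far identifies any finite target language in the limit, since the hypothesis stops changing once every member of the language has appeared in the observation sequence.

```python
def hypothesis(observations):
    """Toy in-the-limit learner for finite languages: guess exactly the
    positively labelled words seen so far. For a finite target language,
    the guess stabilises once all members have appeared."""
    return frozenset(w for (w, label) in observations if label == 1)

# A presentation of the finite language {"a", "ab"}:
stream = [("a", 1), ("b", 0), ("ab", 1), ("ba", 0), ("a", 1)]

# After the first three observations the hypothesis is already correct,
# and no later observation changes it.
guesses = [hypothesis(stream[:n]) for n in range(1, len(stream) + 1)]
```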
- 2.
Note that in general A and B can be Cartesian products of sets, e.g. \(A = A_1 \times \ldots \times A_n\).
- 3.
Here, the relation \(f(x_i) \equiv y_i\) means that \(f(x_i)\) is very close to \(y_i\) for some suitable metric. Of course one such relation is the equality relation on B.
- 4.
By data smoothing we mean any form of statistical averaging or filtering process that can reduce the effects of noise. Data smoothing may be necessary even when a relational model is appropriate.
- 5.
This is done in many approaches including linear regression models, polynomial approximation, Fourier methods, simple neural networks and deep learning.
- 6.
A universal algebraic structure is a many-sorted first-order structure \(\langle A_i : i = 1 , \ldots , n; c_j : j = 1 , \ldots , m; f_k : k = 1 , \ldots , p \rangle \) consisting of data sets \(A_i\), constants \(c_j \in A_{i_j}\), and functions \(f_k : A_{i_{k(1)}} \times \ldots \times A_{i_{k(n)}} \rightarrow A_{i_{k(n+1)}}\) but no relations. Thus a deterministic Moore automaton is a 3-sorted universal algebraic structure. See e.g. [45] or [44] for further details.
- 7.
Here supervised learning is more obvious if we think in terms of regular languages rather than automata. Then we are inferring the language acceptance function \(L: {\varSigma }^* \rightarrow \{ 0, 1 \}\) from a finite set of instances. However, the two viewpoints are equivalent by Kleene’s Theorem.
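To make the language viewpoint concrete, here is a minimal sketch (identifiers ours) of the acceptance function \(L: {\varSigma }^* \rightarrow \{ 0, 1 \}\) computed by a deterministic finite automaton; learning the automaton and learning this function are the two equivalent viewpoints.

```python
def acceptance_function(delta, q0, accepting):
    """Return the language acceptance function L : Sigma* -> {0, 1}
    computed by the DFA (delta, q0, accepting)."""
    def L(word):
        q = q0
        for a in word:
            q = delta[(q, a)]  # follow one transition per input symbol
        return 1 if q in accepting else 0
    return L

# Example: the regular language over {a} of words with an even number of a's.
delta = {(0, "a"): 1, (1, "a"): 0}
L = acceptance_function(delta, q0=0, accepting={0})
```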
- 8.
This assumption amounts to little more than retaining, i.e. not throwing away, any observational data.
- 9.
For algebraists, the important fact here is that \(T({\varSigma }^*, \textit{Observations})\) is the initial object in the appropriate category of automata and homomorphisms, which is unique up to isomorphism. See [46] for details.
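A hedged sketch of the prefix tree construction (data representation and names ours): states are all prefixes of the observed input words, and the transition function extends a prefix by one input symbol.

```python
def prefix_tree(observations):
    """Build the prefix tree automaton T(Sigma*, Observations).
    observations: dict from input words (tuples of symbols) to observed outputs.
    Returns (states, delta, outputs): states are all prefixes of observed
    words, delta[(prefix, symbol)] extends a prefix by one symbol, and
    outputs records the observed output on each full word."""
    states, delta = set(), {}
    for word in observations:
        for i in range(len(word) + 1):
            states.add(word[:i])          # every prefix is a state
        for i in range(len(word)):
            delta[(word[:i], word[i])] = word[:i + 1]
    return states, delta, dict(observations)
```

Folding this tree by merging observationally equivalent states yields quotient automata, which is the sense in which the prefix tree is initial.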
- 10.
The fundamental principle of initiality for \(T({\varSigma }^*, \textit{Observations})\) says that such folding is always possible.
- 11.
For active learning, counterfactual evidence may eventually emerge that destroys the loop hypothesis, but in passive learning this is not possible.
- 12.
Merging is a little complicated to define graph theoretically, so we leave it to the reader as an exercise!
- 13.
Then \(v_1\) is a prefix of \(v_2\) or vice versa.
- 14.
Assuming that \(v_1 . \sigma , v_2 . \sigma \in \textit{Queries}\).
- 15.
Notice here that \(\delta \) is a partial function, i.e. it is not necessarily defined on all arguments.
- 16.
In a lattice \((A, \le )\), a maximum element exceeds all others while a maximal element is exceeded by none. Thus a maximum element must be unique, while a maximal element need not be.
- 17.
We actually present a simple generalisation of L* to an arbitrary output alphabet \({\varOmega }\). This algorithm is termed L*Mealy and first appeared in [34] where it was applied to Mealy machines.
- 18.
Minimum state size seems to be a natural result of many active learning algorithms for Moore automata. This seems to be due to the difficulty of distinguishing pairs of states without any concrete evidence.
- 19.
In model construction, red prefixes are needed to represent states, while blue prefixes are needed for defining transitions. According to our definition, a prefix can be both red and blue, but this is not problematic.
- 20.
What to do when the SUL is non-deterministic will be discussed later.
- 21.
For the reader familiar with algebra, condition (i) corresponds to an algebraic closure condition on the red prefix set under the operation of appending an input \(\sigma \in {\varSigma }\) and modulo row equivalence. The closure condition (ii) corresponds to row equivalence being a congruence on the red prefix set with respect to the state transition function \(\delta \). Thus the red prefix set is able to provide a state set for a quotient automaton defined by the table T.
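Conditions (i) and (ii) can be phrased as the usual closedness and consistency checks on an observation table. The following is a sketch under our own naming: a table maps words to outputs, and the row of a prefix is its tuple of outputs over the suffix set.

```python
def row(table, prefix, suffixes):
    """The row of a prefix: its observed outputs over all suffixes."""
    return tuple(table[prefix + s] for s in suffixes)

def is_closed(table, red, blue, suffixes):
    """(i) Closure: every blue row already appears as some red row."""
    red_rows = {row(table, p, suffixes) for p in red}
    return all(row(table, b, suffixes) in red_rows for b in blue)

def is_consistent(table, red, sigma, suffixes):
    """(ii) Congruence: row-equivalent red prefixes remain row-equivalent
    after appending any input symbol."""
    for p in red:
        for q in red:
            if row(table, p, suffixes) == row(table, q, suffixes):
                for a in sigma:
                    if row(table, p + (a,), suffixes) != row(table, q + (a,), suffixes):
                        return False
    return True
```

When both checks succeed, the red rows can serve as the state set of a quotient automaton defined by the table.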
- 22.
For example, the short-lex ordering < on \({\varSigma }^*\) is suitable. Here \({\overline{\sigma }} < {\overline{\sigma }}'\) if \(\vert {\overline{\sigma }} \vert < \vert {\overline{\sigma }}' \vert \). If \(\vert {\overline{\sigma }} \vert = \vert {\overline{\sigma }}' \vert = n\) then \({\overline{\sigma }} < {\overline{\sigma }}'\) if, and only if \({\overline{\sigma }} < {\overline{\sigma }}'\) in the lexicographical ordering on \({\varSigma }^n\).
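The short-lex ordering is straightforward to realise as a sort key, for instance as follows (a small sketch, names ours):

```python
def shortlex_key(word):
    """Sort key for the short-lex ordering on Sigma*: compare by length
    first, then lexicographically within each length."""
    return (len(word), word)

words = ["ba", "a", "", "ab", "b"]
ordered = sorted(words, key=shortlex_key)  # ['', 'a', 'b', 'ab', 'ba']
```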
- 23.
Notice that it is necessary to use a multiset of observations, i.e. to repeat previous SUL experiments, in order to establish frequencies and probabilities. Thus the cost of learning a probabilistic automaton may be quite high in terms of the query count.
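The frequency estimation alluded to here can be sketched as follows (a simplification, identifiers ours): the repeated traces form a multiset, and transition probabilities are estimated as relative next-symbol frequencies after each prefix.

```python
from collections import Counter

def estimate_transition_probs(traces):
    """Estimate P(next symbol | prefix) from a multiset of observed traces.
    Repetitions matter: each occurrence of a trace contributes one count."""
    counts, totals = Counter(), Counter()
    for trace in traces:
        for i in range(len(trace)):
            prefix = trace[:i]
            counts[(prefix, trace[i])] += 1   # prefix followed by this symbol
            totals[prefix] += 1               # prefix observed at all
    return {(p, a): c / totals[p] for (p, a), c in counts.items()}
```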
- 24.
- 25.
- 26.
In fact congruences and quotients are universal constructions throughout the whole of mathematics.
References
ISO 26262:2011: Road vehicles - Functional safety - Part 8: Supporting processes, Baseline 17. International Organization for Standardization (2011)
Abowd, G.D.: Beyond Weiser: from ubiquitous to collective computing. IEEE Comput. 49(1), 17–23 (2016). https://doi.org/10.1109/MC.2016.22
Alrajeh, D., Russo, A.: Logic-based Learning: Theory and Application. In: Bennaceur, A., Hähnle, R., Meinke, K. (eds.) ML for Dynamic Software Analysis. LNCS, vol. 11026, pp. 219–256. Springer, Cham (2018)
Alur, R., Courcoubetis, C., Henzinger, T.A., Ho, P.-H.: Hybrid automata: an algorithmic approach to the specification and verification of hybrid systems. In: Grossman, R.L., Nerode, A., Ravn, A.P., Rischel, H. (eds.) HS 1991-1992. LNCS, vol. 736, pp. 209–229. Springer, Heidelberg (1993). https://doi.org/10.1007/3-540-57318-6_30
Alur, R., Dill, D.L.: A theory of timed automata. Theor. Comput. Sci. 126(2), 183–235 (1994)
Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987)
Angluin, D., Frazier, M., Pitt, L.: Learning conjunctions of Horn clauses. Mach. Learn. 9(2–3), 147–164 (1992)
Arias, M., Balcázar, J.L., Tirnăucă, C.: Learning definite Horn formulas from closure queries. Theor. Comput. Sci. 658, 346–356 (2017)
Balcázar, J.L., Díaz, J., Gavaldà, R., Watanabe, O.: Algorithms for learning finite automata from queries: a unified view. In: Du, D.Z., Ko, K.I. (eds.) Advances in Algorithms, Languages, and Complexity - In Honor of Ronald V. Book, pp. 53–72. Springer, Boston (1997). https://doi.org/10.1007/978-1-4613-3394-4_2
Bennaceur, A., Giannakopoulou, D., Hähnle, R., Meinke, K.: Machine learning for dynamic software analysis: potentials and limits (Dagstuhl seminar 16172). Dagstuhl Rep. 6(4), 161–173 (2016)
Bergadano, F., Gunetti, D.: Testing by means of inductive program learning. ACM Trans. Softw. Eng. Methodol. 5(2), 119–145 (1996)
Bollig, B., Habermehl, P., Kern, C., Leucker, M.: Angluin-style learning of NFA. In: Proceedings of the 21st International Joint Conference on Artificial Intelligence, IJCAI 2009, Pasadena, California, USA, 11–17 July 2009, pp. 1004–1009 (2009)
Carrasco, R.C., Oncina, J.: Learning deterministic regular grammars from stochastic samples in polynomial time. ITA 33(1), 1–20 (1999)
Cassel, S., Howar, F., Jonsson, B., Merten, M., Steffen, B.: A succinct canonical register automaton model. In: Bultan, T., Hsiung, P.-A. (eds.) ATVA 2011. LNCS, vol. 6996, pp. 366–380. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24372-1_26
Chow, T.S.: Testing software design modeled by finite-state machines. IEEE Trans. Softw. Eng. 4(3), 178–187 (1978)
Denis, F., Lemay, A., Terlutte, A.: Residual finite state automata. Fundam. Inform. 51(4), 339–368 (2002)
Denis, F., Lemay, A., Terlutte, A.: Learning regular languages using RFSAs. Theor. Comput. Sci. 313(2), 267–294 (2004)
Eyraud, R., de la Higuera, C., Janodet, J.: LARS: a learning algorithm for rewriting systems. Mach. Learn. 66(1), 7–31 (2007)
Howar, F., Steffen, B.: Active automata learning in practice: an annotated bibliography of the years 2011 to 2016. In: Bennaceur, A., Hähnle, R., Meinke, K. (eds.) ML for Dynamic Software Analysis. LNCS, vol. 11026, pp. 123–148. Springer, Cham (2018)
Feng, L., Lundmark, S., Meinke, K., Niu, F., Sindhu, M.A., Wong, P.Y.H.: Case studies in learning-based testing. In: Yenigün, H., Yilmaz, C., Ulrich, A. (eds.) ICTSS 2013. LNCS, vol. 8254, pp. 164–179. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41707-8_11
García, P., de Parga, M.V., Álvarez, G.I., Ruiz, J.: Learning regular languages using nondeterministic finite automata. In: Ibarra, O.H., Ravikumar, B. (eds.) CIAA 2008. LNCS, vol. 5148, pp. 92–101. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-70844-5_10
Giannakopoulou, D., Pasareanu, C.S.: Abstraction and learning for infinite-state compositional verification. In: Semantics, Abstract Interpretation, and Reasoning about Programs: Essays Dedicated to David A. Schmidt on the Occasion of his Sixtieth Birthday, Manhattan, Kansas, USA, 19–20th September 2013, pp. 211–228 (2013)
Gold, E.M.: Language identification in the limit. Inf. Control 10, 447–474 (1967)
Gold, E.M.: Complexity of automaton identification from given data. Inf. Control 37, 302–320 (1978)
Grinchtein, O., Jonsson, B., Leucker, M.: Inference of timed transition systems. Electr. Notes Theor. Comput. Sci. 138(3), 87–99 (2005)
Grosu, R., Smolka, S.A., Corradini, F., Wasilewska, A., Entcheva, E., Bartocci, E.: Learning and detecting emergent behavior in networks of cardiac myocytes. Commun. ACM 52(3), 97–105 (2009)
Harel, D., Politi, M.: Modeling Reactive Systems with Statecharts: The Statemate Approach, 1st edn. McGraw-Hill Inc., New York (1998)
de la Higuera, C.: Grammatical Inference: Learning Automata and Grammars. Cambridge University Press, New York (2010)
Holcombe, W.M.: Algebraic Automata Theory. Cambridge University Press, New York (1982)
Hopcroft, J.E., Motwani, R., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation, 3rd edn. Addison-Wesley Longman Publishing Co., Inc., Boston (2006)
Hosseini, M., Shahri, A., Phalp, K., Ali, R.: Four reference models for transparency requirements in information systems. Requir. Eng. 23, 1–25 (2017)
Howar, F.: Active learning of interface programs. Ph.D. thesis, Dortmund University of Technology (2012)
Howar, F., Steffen, B., Jonsson, B., Cassel, S.: Inferring canonical register automata. In: Kuncak, V., Rybalchenko, A. (eds.) VMCAI 2012. LNCS, vol. 7148, pp. 251–266. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-27940-9_17
Hungar, H., Niese, O., Steffen, B.: Domain-specific optimization in automata learning. In: Hunt, W.A., Somenzi, F. (eds.) CAV 2003. LNCS, vol. 2725, pp. 315–327. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45069-6_31
Joachims, T.: Text categorization with support vector machines: learning with many relevant features. In: Nédellec, C., Rouveirol, C. (eds.) ECML 1998. LNCS, vol. 1398, pp. 137–142. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0026683
Kearns, M.J., Vazirani, U.V.: An Introduction to Computational Learning Theory. MIT Press, Cambridge (1994)
Krka, I., Brun, Y., Popescu, D., Garcia, J., Medvidovic, N.: Using dynamic execution traces and program invariants to enhance behavioral model inference. In: ICSE, vol. 2, pp. 179–182 (2010)
Lang, K.J., Pearlmutter, B.A., Price, R.A.: Results of the Abbadingo one DFA learning competition and a new evidence-driven state merging algorithm. In: Honavar, V., Slutzki, G. (eds.) ICGI 1998. LNCS, vol. 1433, pp. 1–12. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0054059
Lorenzoli, D., Mariani, L., Pezzè, M.: Automatic generation of software behavioral models. In: Proceedings of the International Conference on Software Engineering, ICSE, pp. 501–510 (2008)
Mao, H., Chen, Y., Jaeger, M., Nielsen, T.D., Larsen, K.G., Nielsen, B.: Learning probabilistic automata for model checking. In: Eighth International Conference on Quantitative Evaluation of Systems, QEST 2011, Aachen, Germany, 5–8 September 2011, pp. 111–120 (2011)
Maruyama, H.: Machine learning as a programming paradigm and its implications to requirements engineering. In: Asia-Pacific Requirements Engineering Symposium, APRES (2016)
Meinke, K., Niu, F.: An incremental learning algorithm for hybrid automata. Techical report series, KTH Royal Institute of Technology, EECS School (2013)
Meinke, K., Sindhu, M.A.: LBTest: A learning-based testing tool for reactive systems. In: Sixth IEEE International Conference on Software Testing, Verification and Validation, ICST 2013, Luxembourg, Luxembourg, 18–22 March 2013, pp. 447–454 (2013)
Meinke, K., Tucker, J.V. (eds.): Many-Sorted Logic and Its Applications. Wiley, New York (1993)
Meinke, K., Tucker, J.: Universal algebra. In: Handbook of Logic in Computer Science (vol. 1): Background: Mathematical Structures (1993)
Meinke, K.: CGE: a sequential learning algorithm for Mealy automata. In: Sempere, J.M., García, P. (eds.) ICGI 2010. LNCS (LNAI), vol. 6339, pp. 148–162. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15488-1_13
Meinke, K., Nycander, P.: Learning-based testing of distributed microservice architectures: correctness and fault injection. In: Bianculli, D., Calinescu, R., Rumpe, B. (eds.) SEFM 2015. LNCS, vol. 9509, pp. 3–10. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-49224-6_1
Merten, M., Steffen, B., Howar, F., Margaria, T.: Next generation LearnLib. In: Abdulla, P.A., Leino, K.R.M. (eds.) TACAS 2011. LNCS, vol. 6605, pp. 220–223. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19835-9_18
Moore, E.F.: Gedanken-experiments on sequential machines. In: Shannon, C., McCarthy, J. (eds.) Automata Studies, Princeton, NJ, pp. 129–153 (1956)
Moschitti, A.: Kernel-based machines for abstract and easy modeling of automatic learning. In: Bernardo, M., Issarny, V. (eds.) SFM 2011. LNCS, vol. 6659, pp. 458–503. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21455-4_14
Niggemann, O., Stein, B., Vodencarevic, A., Maier, A., Kleine Büning, H.: Learning behavior models for hybrid timed systems. In: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 22–26 July 2012, Toronto, Ontario, Canada (2012). http://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/view/4993
Peled, D., Vardi, M.Y., Yannakakis, M.: Black box checking. J. Autom. Lang. Comb. 7(2), 225–246 (2001)
Reisig, W.: Understanding Petri Nets: Modeling Techniques, Analysis Methods, Case Studies. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-33278-4
Groz, R., Simao, A., Petrenko, A., Oriat, C.: Inferring FSM models of systems without reset. In: Bennaceur, A., Hähnle, R., Meinke, K. (eds.) ML for Dynamic Software Analysis. LNCS, vol. 11026, pp. 178–201. Springer, Cham (2018)
Ron, D., Singer, Y., Tishby, N.: On the learnability and usage of acyclic probabilistic finite automata. J. Comput. Syst. Sci. 56(2), 133–152 (1998)
Salton, G., Yang, C.S., Yu, C.T.: Contribution to the theory of indexing. In: IFIP Congress, pp. 584–590 (1974)
Shinohara, T.: Inductive inference of monotonic formal systems from positive data. New Gener. Comput. 8(4), 371–384 (1991)
Cassel, S., Howar, F., Jonsson, B., Steffen, B.: Extending automata learning to extended finite state machines. In: Bennaceur, A., Hähnle, R., Meinke, K. (eds.) ML for Dynamic Software Analysis. LNCS, vol. 11026, pp. 149–177. Springer, Cham (2018)
Thollard, F., Dupont, P., de la Higuera, C.: Probabilistic DFA inference using Kullback-Leibler divergence and minimality. In: Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, 29 June–2 July 2000, pp. 975–982 (2000)
Vaandrager, F.: Active learning of extended finite state machines. In: Nielsen, B., Weise, C. (eds.) ICTSS 2012. LNCS, vol. 7641, pp. 5–7. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34691-0_2
Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)
Vasilevski, M.P.: Failure diagnosis of automata. Cybernetics 9(4), 653–665 (1973)
Verwer, S.: Efficient identification of timed automata: theory and practice. Ph.D. thesis, Delft University of Technology, Netherlands (2010). http://resolver.tudelft.nl/uuid:61d9f199-7b01-45be-a6ed-04498113a212
Wakefield, J.: Microsoft chatbot is taught to swear on Twitter. Accessed 30 Mar 2017
Walkinshaw, N.: Testing functional black-box programs without a specification. In: Bennaceur, A., Hähnle, R., Meinke, K. (eds.) ML for Dynamic Software Analysis. LNCS, vol. 11026, pp. 101–120. Springer, Cham (2018)
Watkins, C.J.C.H.: Learning from delayed rewards. Ph.D. thesis, King’s College, Cambridge (1989)
Weyuker, E.J.: Assessing test data adequacy through program inference. ACM Trans. Program. Lang. Syst. 5(4), 641–655 (1983)
Wieczorek, W.: Grammatical Inference: Algorithms Routines and Applications, 1st edn. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46801-3
Zhu, H.: A formal interpretation of software testing as inductive inference. J. Softw. Test. Verif. Reliab. 6(1), 3–31 (1996)
Acknowledgments
We gratefully acknowledge financial support for this research from the following projects: EU ITEA 3 project 16032 Testomat, EU ECSEL project 692529-2 SafeCOP, Vinnova FFI project 2013-05608 VIRTUES, ERC Advanced Grant no. 291652 (ASAP), and the EPSRC EP/R013144/1 SAUSE project. We are also very grateful to Schloss Dagstuhl for supporting Dagstuhl Seminar 16172, and to the participants of this workshop for their insightful discussions.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this chapter
Bennaceur, A., Meinke, K. (2018). Machine Learning for Software Analysis: Models, Methods, and Applications. In: Bennaceur, A., Hähnle, R., Meinke, K. (eds.) Machine Learning for Dynamic Software Analysis: Potentials and Limits. Lecture Notes in Computer Science, vol. 11026. Springer, Cham. https://doi.org/10.1007/978-3-319-96562-8_1
Print ISBN: 978-3-319-96561-1
Online ISBN: 978-3-319-96562-8