Abstract
Software systems are often built without any explicit model, so research has focused on inferring models automatically by applying machine learning to execution logs. However, the logs generated by a real software system can be very large, and the inference algorithm may exceed the capacity of a single computer.
This paper focuses on the inference of behavioral models and explores the use of MapReduce to deal with large logs. The approach consists of two distributed algorithms, one for trace slicing and one for model synthesis, each developed as a MapReduce job. With the parallel data processing capacity of MapReduce, the problem of inferring behavioral models from large logs can be solved efficiently. The technique is implemented on top of Hadoop. Experiments on Amazon clusters show the efficiency and scalability of our approach.
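To make the two-stage pipeline concrete, the following is a minimal in-memory sketch in Python of how trace slicing and model synthesis could each be phrased as a map/reduce job. It is not the paper's algorithm, which the abstract does not specify. The log format (timestamp, object id, event), the helper `run_mapreduce`, and the choice of a simple transition-count model are all assumptions made for illustration, and no Hadoop API is used.

```python
from collections import defaultdict

START = "<start>"  # hypothetical marker for the beginning of every slice


def run_mapreduce(records, mapper, reducer):
    """Simulate one MapReduce job in memory: map, shuffle by key, reduce."""
    shuffled = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            shuffled[key].append(value)
    output = []
    for key in sorted(shuffled):
        output.extend(reducer(key, shuffled[key]))
    return output


# Job 1 (trace slicing): group log entries by the object they refer to.
def slice_mapper(entry):
    timestamp, object_id, event = entry  # assumed log entry layout
    yield object_id, (timestamp, event)


def slice_reducer(object_id, timed_events):
    # Order the events of one object by timestamp to obtain its trace slice.
    yield object_id, [event for _, event in sorted(timed_events)]


# Job 2 (model synthesis): count transitions between consecutive events.
def transition_mapper(trace_slice):
    _, events = trace_slice
    previous = START
    for event in events:
        yield (previous, event), 1
        previous = event


def transition_reducer(transition, counts):
    yield transition, sum(counts)


def infer_model(log):
    """Chain the two jobs: slices first, then a transition-count model."""
    slices = run_mapreduce(log, slice_mapper, slice_reducer)
    return dict(run_mapreduce(slices, transition_mapper, transition_reducer))


# Two interleaved objects, f1 and f2, sharing one log.
log = [
    (1, "f1", "open"),
    (2, "f2", "open"),
    (3, "f1", "read"),
    (4, "f2", "close"),
    (5, "f1", "close"),
]
model = infer_model(log)
```

In this sketch the mappers and reducers keep no shared state, which is the property that would let a framework such as Hadoop run them in parallel across log partitions. The resulting `model` maps each observed transition to its frequency, a much simpler structure than the behavioral models the paper targets.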
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Luo, C., He, F., Ghezzi, C. (2015). Inferring Software Behavioral Models with MapReduce. In: Li, X., Liu, Z., Yi, W. (eds) Dependable Software Engineering: Theories, Tools, and Applications. SETTA 2015. Lecture Notes in Computer Science, vol 9409. Springer, Cham. https://doi.org/10.1007/978-3-319-25942-0_9
DOI: https://doi.org/10.1007/978-3-319-25942-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25941-3
Online ISBN: 978-3-319-25942-0
eBook Packages: Computer Science, Computer Science (R0)