ABSTRACT
Software engineering researchers analyze programs by applying a range of test cases, measuring relevant statistics, and reasoning about the observed phenomena. Although traditional statistical methods provide a rigorous analysis of the data obtained during program analysis, they lack the flexibility to build a unique representation for each program. Bayesian methods for data analysis, on the other hand, allow knowledge to be updated flexibly as new observations are acquired. Despite their strong mathematical basis and clear suitability for software analysis, Bayesian methods are still largely under-utilized in the software engineering community, primarily because many software engineers are unfamiliar with how to formulate their research problems in Bayesian terms.
This tutorial will provide a broad introduction to Bayesian methods for data analysis, with a specific focus on problems of interest to software engineering researchers. In addition, the tutorial will cover in depth a subset of popular topics, such as Bayesian inference, probabilistic prediction techniques, Markov models, information theory, and sampling. The core concepts will be explained using case studies and by applying prominent statistical tools to examples drawn from software engineering research. By the end of the tutorial, participants will have acquired the skills and background knowledge needed to formulate their research problems using Bayesian methods and to analyze their formulations using appropriate software tools.
- Bayesian methods for data analysis in software engineering