Abstract
Mining software historical repositories (SHR) has emerged as a research direction over the past decade and has achieved substantial success in both research and practice in supporting various software maintenance tasks. Using different types of SHR, or even different versions of the same software project, may yield different results for the same technique or approach to a maintenance task. Including unrelated information in an SHR-based technique may decrease its effectiveness or even produce wrong results. To the best of our knowledge, little attention has been paid to this issue in the SE community. This paper attempts to bridge this gap and proposes a preprocess to facilitate the selection of related SHR to support various software maintenance tasks. The preprocess uses a topic model to extract the related information from SHR to support software maintenance, thus improving the effectiveness of traditional SHR-based techniques. Empirical results show the effectiveness of our approach.
Notes
- 1. Here we need consecutive versions since many software maintenance tasks are performed based on the differences between consecutive versions in source control repositories.
- 2.
Acknowledgments
The authors would like to thank the anonymous reviewers, whose comments made the paper clearer and stronger. This work is supported partially by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant No. 13KJB520027, partially by the National Natural Science Foundation of China under Grant No. 61402396 and No. 61472344, partially by the Innovative Fund for Industry-Academia-Research Cooperation of Jiangsu Province under Grant No. BY2013063-10, and partially by the Cultivating Fund for Science and Technology Innovation of Yangzhou University under Grant No. 2013CXJ025.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Sun, X., Li, B., Li, Y., Chen, Y. (2015). What Information in Software Historical Repositories Do We Need to Support Software Maintenance Tasks? An Approach Based on Topic Model. In: Lee, R. (eds) Computer and Information Science. Studies in Computational Intelligence, vol 566. Springer, Cham. https://doi.org/10.1007/978-3-319-10509-3_3
DOI: https://doi.org/10.1007/978-3-319-10509-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10508-6
Online ISBN: 978-3-319-10509-3
eBook Packages: Engineering, Engineering (R0)