What Information in Software Historical Repositories Do We Need to Support Software Maintenance Tasks? An Approach Based on Topic Model

Computer and Information Science

Part of the book series: Studies in Computational Intelligence ((SCI,volume 566))

Abstract

Mining software historical repositories (SHR) has emerged as a research direction over the past decade and has achieved substantial success in both research and practice in supporting various software maintenance tasks. Using different types of SHR, or even different versions of the same software project, may yield different results for the same maintenance technique or approach. Including unrelated information in an SHR-based technique may reduce its effectiveness or even produce wrong results. To the best of our knowledge, little attention has been paid to this issue in the SE community. This paper attempts to bridge this gap and proposes a preprocess to facilitate the selection of related SHR to support various software maintenance tasks. The preprocess uses a topic model to extract the related information from SHR, thus improving the effectiveness of traditional SHR-based techniques. Empirical results show the effectiveness of our approach.
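The abstract does not spell out the preprocess, so the following is only a rough sketch of how topic-model-based selection of related SHR entries could look, not the chapter's actual method. It assumes a small collapsed Gibbs sampler for LDA and a cosine-similarity cutoff on the resulting topic distributions; the function names (`fit_lda`, `select_related`), the hyperparameters, and the threshold are all hypothetical choices made for illustration.

```python
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens longer than two letters."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 2]


def fit_lda(docs, num_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, unoptimized).

    docs is a list of token lists; returns one topic distribution per document.
    All hyperparameter defaults are assumptions, not values from the chapter.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # tokens assigned to topic k
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    # Resample each token's topic from its conditional distribution.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return theta


def cosine(a, b):
    """Cosine similarity of two strictly positive vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def select_related(request, artifacts, threshold=0.8, **lda_kwargs):
    """Keep the SHR artifacts whose topic mix is close to the request's.

    The request and the artifacts are modeled together; the threshold is a
    hypothetical cutoff that would need tuning on real data.
    """
    docs = [tokenize(request)] + [tokenize(a) for a in artifacts]
    theta = fit_lda(docs, **lda_kwargs)
    query = theta[0]
    return [a for a, t in zip(artifacts, theta[1:]) if cosine(query, t) >= threshold]
```

In such a setup, `request` might be the text of a change request or bug report and `artifacts` a list of commit messages or earlier reports; only the entries returned by `select_related` would then be passed on to the downstream SHR-based technique. On a real repository a library implementation of LDA would normally replace the toy sampler.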

Notes

  1. Here we need consecutive versions, since many software maintenance tasks are performed based on the differences between consecutive versions in source control repositories.

  2. http://www.cs.wm.edu/semeru/data/msr13/

Acknowledgments

The authors would like to thank the anonymous reviewers, whose comments made the paper clearer and stronger. This work is supported partially by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant No. 13KJB520027, partially by the National Natural Science Foundation of China under Grant No. 61402396 and No. 61472344, partially by the Innovative Fund for Industry-Academia-Research Cooperation of Jiangsu Province under Grant No. BY2013063-10, and partially by the Cultivating Fund for Science and Technology Innovation of Yangzhou University under Grant No. 2013CXJ025.

Author information

Corresponding author

Correspondence to Xiaobing Sun.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Sun, X., Li, B., Li, Y., Chen, Y. (2015). What Information in Software Historical Repositories Do We Need to Support Software Maintenance Tasks? An Approach Based on Topic Model. In: Lee, R. (eds) Computer and Information Science. Studies in Computational Intelligence, vol 566. Springer, Cham. https://doi.org/10.1007/978-3-319-10509-3_3

  • DOI: https://doi.org/10.1007/978-3-319-10509-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10508-6

  • Online ISBN: 978-3-319-10509-3

  • eBook Packages: Engineering, Engineering (R0)
