Summary
Mining software repositories, the process of analyzing data related to software development practices, is an emerging field of research that aims to improve software evolution tasks. The data in many software repositories is unstructured (for example, the natural language text in bug reports), which makes it particularly difficult to mine and analyze. In this chapter, we survey tools and techniques for mining unstructured software repositories, with a focus on information retrieval (IR) models. We also discuss several software engineering tasks that can be enhanced by leveraging unstructured data, including bug prediction, clone detection, bug triage, feature location, code search engines, traceability link recovery, evolution and trend analysis, and bug localization. Finally, we provide a hands-on tutorial on using an IR model on an unstructured repository to perform a software engineering task.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Thomas, S.W., Hassan, A.E., Blostein, D. (2014). Mining Unstructured Software Repositories. In: Mens, T., Serebrenik, A., Cleve, A. (eds) Evolving Software Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45398-4_5
DOI: https://doi.org/10.1007/978-3-642-45398-4_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45397-7
Online ISBN: 978-3-642-45398-4
eBook Packages: Computer Science, Computer Science (R0)