An Application of Intelligent Data Analysis Techniques to a Large Software Engineering Dataset

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5772)

Abstract

Within the development of large software systems, there is significant value in being able to predict changes. If we can predict the likely changes that a system will undergo, then we can estimate likely developer effort and allocate resources appropriately. Within object-oriented software development, these changes are often identified as refactorings. Very few studies have explored the prediction of refactorings on a wide scale. In this paper we aim to do just this by applying intelligent data analysis techniques to a uniquely large and comprehensive software engineering time series dataset. Our analysis shows extremely promising results, allowing us to predict the occurrence of future large changes.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cain, J., Counsell, S., Swift, S., Tucker, A. (2009). An Application of Intelligent Data Analysis Techniques to a Large Software Engineering Dataset. In: Adams, N.M., Robardet, C., Siebes, A., Boulicaut, JF. (eds) Advances in Intelligent Data Analysis VIII. IDA 2009. Lecture Notes in Computer Science, vol 5772. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03915-7_23

  • DOI: https://doi.org/10.1007/978-3-642-03915-7_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03914-0

  • Online ISBN: 978-3-642-03915-7

  • eBook Packages: Computer Science, Computer Science (R0)
