Abstract
Software module clustering is the problem of automatically partitioning the structure of a software system, using low-level dependencies in the source code, in order to understand and improve the system’s architecture. Munch, a clustering tool based on search-based software engineering techniques, was used to modularise a unique dataset of sequential versions of a software system's source code. This paper investigates whether the dataset used for the modularisation resembles a random graph, by computing the probabilities of observing certain connectivity. Modularisation is not possible with data that resembles a random graph. The paper demonstrates that our real-world time-series dataset does not resemble a random graph, except for small sections where large maintenance activities took place. Furthermore, the random graph metric can be used as a tool to indicate areas of interest in the dataset, without the need to run the modularisation.
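The abstract does not spell out the random graph metric, so the following is only an illustrative sketch of what "computing the probabilities of observing certain connectivity" could look like under an Erdős–Rényi G(n, p) model. It is not the paper's measure. The function names, the toy graph, and the choice to treat dependencies as undirected edges are all assumptions made for this example.

```python
from math import comb


def edge_density(n, edges):
    """Estimate p for G(n, p) as observed edges / possible undirected edges."""
    return len(edges) / comb(n, 2)


def degrees(n, edges):
    """Degree of each node 0..n-1 in an undirected edge list."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def degree_tail_probability(n, p, k):
    """P(degree >= k) for a single node in G(n, p).

    In G(n, p) each node's degree follows Binomial(n - 1, p).
    """
    return sum(
        comb(n - 1, d) * p**d * (1 - p) ** (n - 1 - d)
        for d in range(k, n)
    )


# Hypothetical toy dependency graph: one hub class (node 0) used by all others.
n = 8
edges = [(0, i) for i in range(1, n)]

p = edge_density(n, edges)
max_deg = max(degrees(n, edges))
tail = degree_tail_probability(n, p, max_deg)
print(f"p = {p:.3f}, max degree = {max_deg}, P(deg >= max) = {tail:.2e}")
```

A very small tail probability suggests that the observed connectivity would be unlikely in a random graph of the same density, which is the kind of signal the abstract describes. A value close to what the random model predicts would point the other way.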
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Arzoky, M., Swift, S., Counsell, S., Cain, J. (2014). A Measure of the Modularisation of Sequential Software Versions Using Random Graph Theory. In: Dingsøyr, T., Moe, N.B., Tonelli, R., Counsell, S., Gencel, C., Petersen, K. (eds) Agile Methods. Large-Scale Development, Refactoring, Testing, and Estimation. XP 2014. Lecture Notes in Business Information Processing, vol 199. Springer, Cham. https://doi.org/10.1007/978-3-319-14358-3_10
DOI: https://doi.org/10.1007/978-3-319-14358-3_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14357-6
Online ISBN: 978-3-319-14358-3
eBook Packages: Computer Science, Computer Science (R0)