Abstract
This work presents the application of branch-and-price approaches to the automatic version of the Software Clustering Problem. To tackle this problem, we apply the Dantzig–Wolfe decomposition to a formulation from the literature. We then present two Column Generation (CG) approaches to solve the linear programming relaxation of the resulting reformulation: the standard CG approach and a new approach, which we call Staged Column Generation (SCG). We also propose a modification to the pricing subproblem that allows multiple columns to be added at each CG iteration. We test our algorithms on a set of 45 instances from the literature. The proposed approaches improve on the results in the literature, solving all of these instances to optimality. Furthermore, as instance size grows, the SCG approach shows a considerable improvement over the standard CG in computational time, number of iterations, and number of generated columns.
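As a rough illustration of what column generation operates on, the following is a generic set-partitioning master problem, assuming (the abstract does not state this) that the clustering objective can be written as a sum of per-cluster scores. The symbols $V$ (modules), $\Omega$ (candidate clusters), $q_S$ (score of cluster $S$), $\lambda_S$ and $\pi_i$ are hypothetical notation introduced here, not the paper's.

```latex
% Restricted master problem over a subset \Omega of candidate clusters
\begin{align*}
\max_{\lambda}\quad & \sum_{S \in \Omega} q_S \,\lambda_S \\
\text{s.t.}\quad & \sum_{S \in \Omega :\, i \in S} \lambda_S = 1 && \forall i \in V, \\
& \lambda_S \ge 0 && \forall S \in \Omega .
\end{align*}
% Pricing: with duals \pi_i of the partitioning constraints,
% a new column S can improve the LP only if its reduced cost is positive
\[
\bar{q}_S \;=\; q_S - \sum_{i \in S} \pi_i \;>\; 0 .
\]
```

Under these assumptions, a CG iteration solves the restricted master, passes the duals $\pi_i$ to the pricing subproblem, and adds one or more clusters with $\bar{q}_S > 0$; the linear relaxation is solved once no such cluster exists.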
Notes
More info at https://realopt.bordeaux.inria.fr/?page_id=2.
Acknowledgments
HH Kramer was financially supported by CNPq/CsF Grant No. 246661/2012-7 and CAPES
Cite this article
Kramer, H.H., Uchoa, E., Fampa, M. et al. Column generation approaches for the software clustering problem. Comput Optim Appl 64, 843–864 (2016). https://doi.org/10.1007/s10589-015-9822-9
DOI: https://doi.org/10.1007/s10589-015-9822-9