
Column generation approaches for the software clustering problem

Computational Optimization and Applications

Abstract

This work presents the application of branch-and-price approaches to the automatic version of the Software Clustering Problem. To tackle this problem, we apply the Dantzig–Wolfe decomposition to a formulation from the literature. We then present two Column Generation (CG) approaches for solving the linear programming relaxation of the resulting reformulation: the standard CG approach and a new approach, which we call Staged Column Generation (SCG). We also propose a modification to the pricing subproblem that allows multiple columns to be added at each CG iteration. We test our algorithms on a set of 45 instances from the literature. The proposed approaches improve on the results reported in the literature, solving all of these instances to optimality. Furthermore, as instance size grows, the SCG approach shows a considerable improvement over the standard CG in computational time, number of iterations, and number of generated columns.


Notes

  1. More info at https://realopt.bordeaux.inria.fr/?page_id=2.

Acknowledgments

H.H. Kramer was financially supported by CNPq/CsF Grant No. 246661/2012-7 and CAPES.

Author information

Corresponding author

Correspondence to Marcia Fampa.

Cite this article

Kramer, H.H., Uchoa, E., Fampa, M. et al. Column generation approaches for the software clustering problem. Comput Optim Appl 64, 843–864 (2016). https://doi.org/10.1007/s10589-015-9822-9

  • DOI: https://doi.org/10.1007/s10589-015-9822-9
