Abstract
Finding clusters, or communities, in a graph, or network, is an important problem that arises in many domains. Several models have been proposed for its solution. One of the most studied and exploited is the maximization of the so-called modularity, which is the sum over all communities of the fraction of edges within each community minus the expected fraction of such edges in a random graph with the same degree distribution. As this problem is NP-hard, a few non-polynomial algorithms and a large number of heuristics have been proposed to find, respectively, optimal or high-modularity partitions of a given graph. We focus on one of these heuristics, namely a divisive hierarchical method, which works by recursively splitting a cluster into two new clusters in an optimal way. This splitting step is performed by solving a convex quadratic program. We propose a compact reformulation of this model, using a change of variables, an expansion of integers in powers of two, and symmetry-breaking constraints. The resolution time is reduced by a factor of up to 10 with respect to that obtained with the original formulation.
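As a concrete reading of the modularity definition given in the abstract, the following minimal sketch evaluates the modularity of a given partition of a small undirected graph without loops or multiple edges. It assumes the standard form Q = sum over communities c of [m_c / m - (D_c / (2m))^2], where m is the number of edges, m_c the number of edges inside c, and D_c the sum of the degrees of the vertices of c. The function name `modularity`, the edge-list representation, and the toy graph are illustrative assumptions and do not come from the paper; the sketch only scores a partition and does not implement the quadratic program used for the splitting step.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of an undirected simple graph.

    edges: list of (u, v) pairs, each edge listed once (assumed format).
    community: dict mapping every vertex to a community label.
    """
    m = len(edges)
    inner = defaultdict(int)       # m_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # D_c: sum of degrees of vertices in c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            inner[community[u]] += 1
    # Fraction of inner edges minus its expected value, summed over communities.
    return sum(inner[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # one community per triangle
single = {v: 0 for v in range(6)}             # everything in one community

q_split = modularity(edges, split)
q_single = modularity(edges, single)
```

On this toy graph the two-triangle bipartition has modularity 5/14 (about 0.357), while the trivial single-community partition has modularity 0, so a divisive method would accept this split as an improvement.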
Acknowledgements
The authors would like to thank the anonymous referees for their valuable comments and suggestions. Financial support by Grants Digiteo 2009-14D “RMNCCO” and Digiteo 2009-55D “ARM” is gratefully acknowledged. P.H. was partially supported by FQRNT (Fonds de recherche du Québec – Nature et technologies) team grant PR-131365.
Cite this article
Cafieri, S., Costa, A. & Hansen, P. Reformulation of a model for hierarchical divisive graph modularity maximization. Ann Oper Res 222, 213–226 (2014). https://doi.org/10.1007/s10479-012-1286-z
DOI: https://doi.org/10.1007/s10479-012-1286-z