An efficient method for mining the maximal α-quasi-clique-community of a given node in complex networks

  • Original Article
  • Published in: Social Network Analysis and Mining

Abstract

Detecting communities in large complex networks is important for understanding their structure and for extracting features useful for visualization or for predicting phenomena such as the diffusion of information or the dynamics of the network. A community is defined as a set of strongly interconnected nodes. An α-quasi-clique is a group of nodes in which each member is connected to more than a proportion α of the other nodes. By construction, an α-quasi-clique has a density greater than α. The size of an α-quasi-clique is limited by the degree of its nodes. In complex networks whose degree distribution follows a power law, α-quasi-cliques are usually small sets of nodes when α is high. In this paper, we present an efficient method for finding the maximal α-quasi-clique of a given node in the network. The communities produced by our method therefore have two main characteristics: they are α-quasi-cliques (very dense for high α) and they are local to the given node. Detecting the local community of specific nodes is very important for applications dealing with huge networks, where iterating through all nodes would be impractical, or where the network is not entirely known. The proposed method, called RANK-NUM-NEIGHS (RNN), is evaluated experimentally on real and computer-generated networks in terms of quality (community size), execution time and stability. We also provide an upper bound on the optimal solution.
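The definition above can be made concrete with a short Python sketch. It is not the RNN method, only a check of the α-quasi-clique condition and of the link density of a node set. The graph representation (a dict mapping each node to its set of neighbors in an undirected graph), the function names, and the treatment of sets with fewer than two nodes are assumptions made for illustration. The density claim follows from summing the degree condition over the n members: the internal degrees add up to 2|E| and each exceeds α(n − 1), so 2|E| > α n(n − 1).

```python
from itertools import combinations


def density(nodes, adj):
    """Link density 2|E| / (|V| (|V| - 1)) of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0  # assumption: fewer than two nodes is given density 0
    edges = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    return 2 * edges / (n * (n - 1))


def is_alpha_quasi_clique(nodes, adj, alpha):
    """True if every member is linked to more than alpha * (|S| - 1) other members."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return False  # assumption: sets with fewer than two nodes are rejected
    return all(len(adj[u] & (nodes - {u})) > alpha * (n - 1) for u in nodes)


# Hypothetical toy graph: a 4-cycle 1-2-3-4 with the chord 1-3.
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}
community = {1, 2, 3, 4}

print(is_alpha_quasi_clique(community, adj, 0.6))  # True: every node has at least 2 of 3 links
print(is_alpha_quasi_clique(community, adj, 0.7))  # False: nodes 2 and 4 have only 2 of 3 links
print(density(community, adj))                     # 5 edges out of 6 possible, about 0.83
```

In the toy graph the set is a 0.6-quasi-clique and its density (about 0.83) is indeed greater than 0.6.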

Notes

  1. The density of links \(\delta \) of a graph G with |E| edges and |V| nodes is given by \(\frac{2|E|}{|V|(|V|-1)}\).

  2. A complete clique is a set of nodes such that every two distinct nodes are connected to each other.

  3. This definition of an \(\alpha \)-quasi-clique is not unique. Most authors define an \(\alpha \)-quasi-clique as a set of nodes whose density is greater than \(\alpha \); see for instance Abello et al. (2002). The definition considered in this paper constitutes a relative relaxation of a complete clique, as it depends on the size of the quasi-clique.

  4. Notice that we use the word maximal instead of maximum. In graph theory, a maximal clique is a clique that is not a proper subset of another clique, whereas a maximum clique is a clique of maximum cardinality in the graph. Since we aim to find \(\alpha \)-quasi-cliques containing a given node of interest, we are looking for maximal \(\alpha \)-quasi-cliques rather than the maximum \(\alpha \)-quasi-clique.

  5. If \(\alpha <0.5\), Theorem 1 no longer holds. The input of the algorithm is then not limited to the second neighborhood and may extend to the whole graph, especially for small \(\alpha \).

  6. The average degree is 20, the maximum degree is 50, the exponent of the degree distribution is −2 and that of the community size distribution is −1. We chose three values of the mixing parameter \(\lambda \): 0.10, 0.20 and 0.30. The results presented in this paper for size, density and stability are those for \(\lambda =0.10\). The results behave in nearly the same way for the other values of the mixing parameter.

  7. The results obtained for other network sizes have nearly the same behavior.
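Note 5 implies that, for \(\alpha \ge 0.5\), the candidate nodes for the community of a given node can be restricted to its second neighborhood. Theorem 1 itself is not reproduced in this excerpt, so the following Python sketch only illustrates how such a candidate set might be collected. The dict-of-sets graph representation and the function name are assumptions for illustration, not part of the paper.

```python
def second_neighborhood(adj, source):
    """Nodes within two hops of `source`, excluding `source` itself."""
    first = set(adj[source])            # direct neighbors
    second = set()
    for u in first:
        second.update(adj[u])           # neighbors of neighbors
    return (first | second) - {source}


# Hypothetical toy graph: the path 0 - 1 - 2 - 3 - 4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

print(second_neighborhood(adj, 0))  # {1, 2}
print(second_neighborhood(adj, 2))  # {0, 1, 3, 4}
```

Restricting the search to this set keeps the work local to the node of interest, which matters when the network is huge or only partially known.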

References

  • Abello J, Resende MGC, Sudarsky S (2002) Massive quasi-clique detection. In: Proceedings of the 5th Latin American symposium on theoretical informatics, LATIN ’02. Springer, London, pp 598–612

  • Adamic LA, Glance N (2005) The political blogosphere and the 2004 U.S. election. In: Proceedings of the WWW-2005 workshop on the weblogging ecosystem. ACM, New York, pp 36–43

  • Akoglu L, McGlohon M, Faloutsos C (2009) Anomaly detection in large graphs. Technical report CMU-CS-09-173

  • Asahiro Y, Hassin R, Iwama K (2002) Complexity of finding dense subgraphs. Discrete Appl Math 121(1–3):15–26. https://doi.org/10.1016/S0166-218X(01)00243-8

  • Bagrow JP (2008) Evaluating local community methods in networks. J Stat Mech 2008:05001

  • Bahmani B, Kumar R, Vassilvitskii S (2012) Densest subgraph in streaming and mapreduce. CoRR abs/1201.6567. http://arxiv.org/abs/1201.6567

  • Battiti R, Mascia F (2007) Reactive local search for maximum clique: a new implementation. Technical report DIT-07-018, Informatica e Telecomunicazioni, University of Trento, Trento, Italy

  • Battiti R, Protasi M (2001) Reactive local search for the maximum clique problem. Algorithmica 29(4):610

  • Ben-Dor A, Shamir R, Yakhini Z (1999) Clustering gene expression patterns. J Comput Biol 6(3):281–297

  • Blondel VD, Guillaume J, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008:P10008

  • Bomze IM, Budinich M, Pardalos PM, Pelillo M (1999) The maximum clique problem. In: Du D-Z, Pardalos PM (eds) Handbook of combinatorial optimization. Kluwer Academic Publishers, Dordrecht, pp 1–74

  • Brunato M, Hoos HH, Battiti R (2007) On effectively finding maximal quasi-cliques in graphs. In: Maniezzo V, Battiti R, Watson JP (eds) LION, vol 5313. Lecture Notes in Computer Science. Springer, Berlin, pp 41–55

  • Campigotto R, Conde-Céspedes P, Guillaume J (2014) A generalized and adaptive method for community detection. CoRR abs/1406.2518. http://arxiv.org/abs/1406.2518

  • Chen J, Saad Y (2012) Dense subgraph extraction with application to community detection. IEEE Trans Know Data Eng 24(7):1216–1230

  • Chen J, Zaiane OR, Goebel R (2009) Local communities identification in social networks. In: ASONAM, pp 237–242

  • Clauset A (2005) Finding local community structure in networks. Phys Rev E 72:026132

  • Conde-Céspedes P, Marcotorchino J, Viennet E (2015) Comparison of linear modularization criteria using the relational formalism, an approach to easily identify resolution limit. Revue des Nouvelles Technologies de l’Information Extraction et Gestion des Connaissances, RNTI-E-28, pp 203–214

  • Conde-Céspedes P, Marcotorchino JF, Viennet E (2017) Comparison of linear modularization criteria using the relational formalism, an approach to easily identify resolution limit. In: Guillet F, Pinaud B, Venturini G (eds) Advances in knowledge discovery and management (AKDM-6). Springer, Cham, pp 101–120

  • Conde-Céspedes P, Ngonmang B, Viennet E (2015) Approximation of the maximal \(\alpha \)-consensus local community detection problem in complex networks. In: IEEE SITIS 2015, complex networks and their applications. Bangkok, Thailand

  • Condorcet CAMd (1785) Essai sur l’application de l’analyse à la probabilité des décisions rendues à la pluralité des voix. J Math Sociol 1(1): 113–120

  • Cui W, Xiao Y, Wang H, Wang W (2014) Local search of communities in large graphs. In: Proceedings of the 2014 ACM SIGMOD international conference on management of data, SIGMOD ’14. ACM, New York, pp 991–1002

  • Dang TA, Viennet E (2012) Community detection based on structural and attribute similarities. In: International conference on digital society (ICDS), pp 7–14

  • Dang TA, Viennet E (2013) Collaborative filtering in social networks: a community-based approach. In: IEEE ComManTel 2013, international conference on computing, management and telecommunications

  • Fortunato S (2010) Community detection in graphs. Phys Rep 486:75–174

  • Fortunato S, Barthelemy M (2006) Resolution limit in community detection. In: Proceedings of the National Academy of Sciences of the United States of America

  • Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci U. S. A. 99(12):7821–7826

  • Harary F, Ross IC (1957) A procedure for clique detection using the group matrix. Sociometry 20:205–215

  • Karp RM (1972) Reducibility among combinatorial problems. In: Miller RE, Thatcher JW (eds) Complexity of computer computations, the IBM research symposia series. Plenum Press, New York, pp 85–103

  • Komusiewicz C (2016) Multivariate algorithmics for finding cohesive subnetworks. Algorithms 9(1):21

  • Krebs V (2004) Books about US politics http://www.orgnet.com/

  • Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78(4):046110

  • Lee VE, Ruan N, Jin R, Aggarwal CC (2010) A survey of algorithms for dense subgraph discovery. In: Aggarwal CC, Wang H (eds) Managing and mining graph data, advances in database systems, vol 40. Springer, Berlin, pp 303–336

  • Liang R, Hua J, Wang X (2012) VCD: a network visualization tool based on community detection. In: 2012 12th international conference on control, automation and systems (ICCAS), pp 1221–1226

  • Liu G, Wong L (2008) Effective pruning techniques for mining quasi-cliques. In: Daelemans W, Goethals B, Morik K (eds) Machine learning and knowledge discovery in databases, vol 5212. Lecture notes in computer science. Springer, Berlin, pp 33–49

  • Luo F, Wang JZ, Promislow E (2006) Exploring local community structure in large networks. In: WI ’06, pp 233–239

  • Marcotorchino F, Michaud P (1979) Optimisation en analyse ordinale des données. Masson, Paris

  • Matsuda H, Ishihara T, Hashimoto A (1999) Classifying molecular sequences using a linkage graph with their pairwise similarities. Theor Comput Sci 210(2):305–325

  • Newman M, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113

  • Ngonmang B, Tchuente M, Viennet E (2012) Local communities identification in social networks. Parallel Process Lett. https://doi.org/10.1142/S012962641240004X

  • Ngonmang B, Viennet E, Tchuente M (2012) Churn prediction in a real online social network using local community analysis. In: International conference on advances in social networks analysis and mining, ASONAM 2012, Istanbul, Turkey, 26–29 August 2012, pp 282–288

  • Owsiński J, Zadrożny S (1986) Clustering for ordinal data: a linear programming formulation. Control Cybern 15(2):183–193

  • Pattillo J, Veremyev A, Butenko S, Boginski V (2013) On the maximum quasi-clique problem. Discret Appl Math 161:244–257

  • Pattillo J, Youssef N, Butenko S (2013) On clique relaxation models in network analysis. Eur J Oper Res 226(1):9–18

  • Pei J, Jiang D, Zhang A (2005) On mining cross-graph quasi-cliques. In: Proceedings of the eleventh ACM SIGKDD international conference on knowledge discovery in data mining, KDD ’05. ACM, New York, pp 228–238

  • Pullan WJ, Hoos HH (2006) Dynamic local search for the maximum clique problem. J Artif Intell Res (JAIR) 25:159–185

  • Tanay A, Sharan R, Shamir R (2002) Discovering statistically significant biclusters in gene expression data. In: Proceedings of ISMB 2002, pp 136–144

  • Tsourakakis C, Bonchi F, Gionis A, Gullo F, Tsiarli M (2013) Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In: Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’13. ACM, New York, pp 104–112

  • Wu Q, Hao JK (2015) A review on algorithms for maximum clique problems. Eur J Oper Res 242(3):693–709

  • Yang J, Leskovec J (2014) Overlapping communities explain core-periphery organization of networks. Technical report, Stanford University. http://ilpubs.stanford.edu:8090/1103/

  • Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473

  • Zahn C (1964) Approximating symmetric relations by equivalence relations. SIAM J Appl Math 12:840–847

  • Zhang Y, Lin H, Yang Z, Wang J (2016) Construction of dynamic probabilistic protein interaction networks for protein complex identification. BMC Bioinform. https://doi.org/10.1186/s12859-016-1054-1

Acknowledgements

This work was supported by the REQUEST project.

Author information

Corresponding author

Correspondence to Patricia Conde-Cespedes.

About this article

Cite this article

Conde-Cespedes, P., Ngonmang, B. & Viennet, E. An efficient method for mining the maximal α-quasi-clique-community of a given node in complex networks. Soc. Netw. Anal. Min. 8, 20 (2018). https://doi.org/10.1007/s13278-018-0497-y

  • DOI: https://doi.org/10.1007/s13278-018-0497-y
