
Detecting Communities in Complex Networks Using Formal Concept Analysis

Advances in Knowledge Discovery and Management

Part of the book series: Studies in Computational Intelligence (SCI, volume 1004)


Abstract

The complex nature of many real-world networks is motivating researchers to investigate or extend network analysis methods such as centrality computation, link prediction, and community detection. One of these complex structures is the multilayer network in which each layer contains a network. Multilayer networks frequently possess complex local structures of multimodal data and interlinked relations. Thus, efficient detection of local communities in such networks often remains a key challenge. In this paper, we propose a community detection strategy, called CoDeBi, which leverages Formal Concept Analysis (FCA) to find possibly overlapping and nested communities in multilayer networks. At the preprocessing stage, we exploit operations such as apposition, subposition and composition on formal contexts—associated with individual layers—to generate a global formal context representing the whole multilayer network. At the first step of CoDeBi, we extract the formal concepts that capture groups in the global formal context while in the second step, we filter the extracted formal concepts to keep only the ones that have a high harmonic mean of stability and separation indices. Such groups represent core communities. In the third step, we detect final communities by refining the core groups using Silhouette Analysis. Our validation study shows that CoDeBi can accurately identify communities in bipartite graphs, and hence can be exploited for community detection in multilayer networks. Another contribution of this paper is the application of the attractive features of Triadic Concept Analysis and the adaptation of our approach to the analysis of tridimensional networks represented by a tridimensional adjacency matrix.
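As a rough illustration of the first step (extracting formal concepts from a formal context), the following Python sketch enumerates all concepts of a tiny, made-up context by closing every attribute subset. It is not the CoDeBi implementation: the function names (`extent`, `intent`, `formal_concepts`) and the toy data are hypothetical, and the brute-force enumeration is exponential in the number of attributes, so it only suits very small examples.

```python
from itertools import combinations


def extent(context, attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)


def intent(context, objs, all_attrs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(all_attrs)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every attribute subset."""
    all_attrs = frozenset().union(*context.values())
    concepts = set()
    for r in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), r):
            ext = extent(context, frozenset(subset))
            concepts.add((ext, intent(context, ext, all_attrs)))
    return concepts


# Hypothetical toy context: objects 1..4 described by attributes a, b, c.
toy = {
    1: frozenset("ab"),
    2: frozenset("ab"),
    3: frozenset("bc"),
    4: frozenset("c"),
}

concepts = formal_concepts(toy)
for ext, itt in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(itt))
```

On this toy context, object 3 belongs to both the extent {1, 2, 3} and the extent {3, 4}, and the extent {1, 2} is contained in {1, 2, 3}, which is the kind of overlapping and nested grouping the abstract refers to.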


Notes

  1. This diagram is produced using the public domain software called ConExp.

  2. We sometimes use simplified notations for sets. E.g. 125 stands for \(\{1, 2, 5\}\) and ab for \(\{a, b\}\). We also write \((X_j, X_k) \subseteq K_j \times K_k\) to mean that \(X_j\subseteq K_j\) and \(X_k \subseteq K_k\).

  3. The equation is slightly different from the general one proposed by Kuznetsov and Makhalova (2016) because the symbol \(\supseteq \) is more appropriate than \(=\).

  4. https://networkdata.ics.uci.edu/netdata/html/davis.html.

  5. http://archive.ics.uci.edu/ml/datasets/zoo.

References

  • Berlingerio, M., Coscia, M., & Giannotti, F. (2011a). Finding and characterizing communities in multidimensional networks. In 2011 International Conference on Advances in Social Networks Analysis and Mining (pp. 490–494). IEEE.


  • Berlingerio, M., Coscia, M., Giannotti, F., Monreale, A., & Pedreschi, D. (2011b). Foundations of multidimensional network analysis. In 2011 International Conference on Advances in Social Networks Analysis and Mining (ASONAM) (pp. 485–489). IEEE.


  • Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.


  • Boccaletti, S., Bianconi, G., Criado, R., Genio, C. I. D., Gómez-Gardeñes, J., Romance, M., Sendiña-Nadal, I., Wang, Z., & Zanin, M. (2014). The structure and dynamics of multilayer networks. arXiv:1407.0742.

  • Boden, B., Günnemann, S., Hoffmann, H., & Seidl, T. (2012). Mining coherent subgraphs in multi-layer graphs with edge labels. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1258–1266). ACM.


  • Borgatti, S. P. (2009). 2-mode concepts in social network analysis. Encyclopedia of Complexity and System Science, 6, 8279–8291.


  • Boutemine, O., & Bouguessa, M. (2017). Mining community structures in multidimensional networks. TKDD, 11(4), 51:1–51:36.


  • Buzmakov, A., Kuznetsov, S. O., & Napoli, A. (2014). Scalable estimates of concept stability. In International Conference on Formal Concept Analysis (pp. 157–172). Springer.


  • Cerf, L., Besson, J., Robardet, C., & Boulicaut, J. (2009). Closed patterns meet n-ary relations. TKDD, 3(1), 3:1–3:36.


  • Chakraborty, T., Dalmia, A., Mukherjee, A., & Ganguly, N. (2017). Metrics for community analysis: A survey. ACM Computing Surveys (CSUR), 50(4), 54.


  • Collins, L. M., & Dent, C. W. (1988). Omega: A general formulation of the rand index of cluster recovery suitable for non-disjoint solutions. Multivariate Behavioral Research, 23(2), 231–242.


  • Crampes, M., & Plantié, M. (2012). Détection de communautés dans les graphes bipartis. In IC 2012 (p. 125).


  • Dickison, M. E., Magnani, M., & Rossi, L. (2016). Multilayer Social Networks (1st edn). New York: Cambridge University Press.


  • Dong, X., Frossard, P., Vandergheynst, P., & Nefedov, N. (2012). Clustering with multi-layer graphs: A spectral perspective. IEEE Transactions on Signal Processing, 60(11), 5820–5831.


  • Du, N., Wang, B., Wu, B., & Wang, Y. (2008). Overlapping community detection in bipartite networks. In Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology-Volume 01 (pp. 176–179). IEEE Computer Society.


  • Dunlavy, D. M., Kolda, T. G., & Kegelmeyer, W. P. (2011). Multilinear algebra for analyzing data with multiple linkages. In Graph Algorithms in the Language of Linear Algebra (pp. 85–114). SIAM.


  • Everett, M. G., & Borgatti, S. P. (2013). The dual-projection approach for two-mode networks. Social Networks, 35(2), 204–210.


  • Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3–5), 75–174.


  • Ganter, B., & Obiedkov, S. A. (2016). Conceptual Exploration. Berlin: Springer.


  • Ganter, B., & Wille, R. (1999). Formal Concept Analysis: Mathematical Foundations. New York: Springer. Translated by C. Franzke.


  • Hacene, M. R., Huchard, M., Napoli, A., & Valtchev, P. (2013). Relational concept analysis: Mining concept lattices from multi-relational data. Annals of Mathematics and Artificial Intelligence, 67(1), 81–108.


  • Hmimida, M., & Kanawati, R. (2015). Community detection in multiplex networks: A seed-centric approach. NHM, 10(1), 71–85.


  • Ibrahim, M. H., & Missaoui, R. (2018). An efficient approximation of concept stability using low-discrepancy sampling. In Graph-Based Representation and Reasoning - 23rd International Conference on Conceptual Structures, ICCS 2018, Edinburgh, UK, June 20-22, 2018, Proceedings (pp. 24–38).


  • Interdonato, R., Atzmueller, M., Gaito, S., Kanawati, R., Largeron, C., & Sala, A. (2019). Feature-rich networks: Going beyond complex network topologies. Applied Network Science, 4(1), 4:1–4:13.


  • Jay, N., Kohler, F., & Napoli, A. (2008). Analysis of social communities with iceberg and stability-based concept lattices. In International Conference on Formal Concept Analysis (pp. 258–272). Springer.


  • Kim, J., & Lee, J.-G. (2015). Community detection in multi-layer graphs: A survey. ACM SIGMOD Record, 44(3), 37–48.


  • Kivelä, M., Arenas, A., Barthelemy, M., Gleeson, J. P., Moreno, Y., & Porter, M. A. (2014). Multilayer networks. Journal of Complex Networks, 2(3), 203–271.


  • Klimushkin, M., Obiedkov, S., & Roth, C. (2010). Approaches to the selection of relevant concepts in the case of noisy data. In International Conference on Formal Concept Analysis (pp. 255–266). Springer.


  • Kolda, T. G., & Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review, 51(3), 455–500.


  • Kuznetsov, S. O. (2007). On stability of a formal concept. Annals of Mathematics and Artificial Intelligence, 49(1), 101–115.


  • Kuznetsov, S. O., & Makhalova, T. (2018). On interestingness measures of formal concepts. Information Sciences, 442, 202–219.


  • Kuznetsov, S. O., & Makhalova, T. P. (2016). On stability of triadic concepts. In Proceedings of the Thirteenth International Conference on Concept Lattices and Their Applications, Moscow, Russia, July 18-22, 2016 (pp. 245–253).


  • Lancichinetti, A., Radicchi, F., Ramasco, J. J., & Fortunato, S. (2010). Finding statistically significant communities in networks. arXiv:1012.2363.

  • Lehmann, F., & Wille, R. (1995). A triadic approach to formal concept analysis. In ICCS (pp. 32–43).


  • Lehmann, S., Schwartz, M., & Hansen, L. K. (2008). Biclique communities. Physical Review E, 78(1), 016108.


  • Li, H., Nie, Z., Lee, W.-C., Giles, L., & Wen, J.-R. (2008). Scalable community discovery on textual data with relations. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (pp. 1203–1212). ACM.


  • Messaoudi, A., Missaoui, R., & Ibrahim, M. H. (2019). Detecting overlapping communities in two-mode data networks using formal concept analysis. Revue des Nouvelles Technologies de l’Information, Extraction et Gestion des connaissances, RNTI-E-35, 189–200.


  • Newman, M. E., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113.


  • Newman, M. E. J. (2003). The structure and function of complex networks. SIAM Review, 45(2), 167–256.


  • Nicosia, V., Mangioni, G., Carchiolo, V., & Malgeri, M. (2009). Extending the definition of modularity to directed graphs with overlapping communities. Journal of Statistical Mechanics: Theory and Experiment, 2009(03), 3–24.


  • Palla, G., Derényi, I., Farkas, I., & Vicsek, T. (2005). Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043), 814.


  • Potgieter, A., April, K. A., Cooke, R. J., & Osunmakinde, I. O. (2009). Temporality in link prediction: Understanding social complexity. Emergence: Complexity & Organization (E: CO), 11(1), 69–83.


  • Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4), 1118–1123.


  • Roth, C., Obiedkov, S., & Kourie, D. G. (2008). On succinct representation of knowledge community taxonomies with formal concept analysis. International Journal of Foundations of Computer Science, 19(02), 383–404.


  • Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65.


  • Salehi, M., Sharma, R., Marzolla, M., Magnani, M., Siyari, P., & Montesi, D. (2015). Spreading processes in multilayer networks. IEEE Transactions on Network Science and Engineering, 2(2), 65–83.


  • Shi, C., Li, Y., Zhang, J., Sun, Y., & Yu, P. S. (2017). A survey of heterogeneous information network analysis. IEEE Transactions on Knowledge and Data Engineering, 29(1), 17–37.


  • Silva, A., Meira, W., Jr., & Zaki, M. J. (2012). Mining attribute-structure correlated patterns in large attributed graphs. Proceedings of the VLDB Endowment, 5(5), 466–477.


  • Sun, Y., & Han, J. (2012). Mining Heterogeneous Information Networks: Principles and Methodologies. Synthesis Lecture on Data Mining and Knowledge Discovery. San Rafael: Morgan & Claypool Publishers.


  • Tang, L., & Liu, H. (2010). Community detection and mining in social media. Synthesis Lectures on Data Mining and Knowledge Discovery, 2(1), 1–137.


  • Tang, W., Lu, Z., & Dhillon, I. S. (2009). Clustering with multiple graphs. In 2009 Ninth IEEE International Conference on Data Mining (pp. 1016–1021). IEEE.


  • Valtchev, P., & Missaoui, R. (2001). Building concept (galois) lattices from parts: Generalizing the incremental methods. In Conceptual Structures: Broadening the Base, 9th International Conference on Conceptual Structures, ICCS 2001, Stanford, CA, USA, July 30-August 3, 2001, Proceedings (pp. 290–303).


  • Valtchev, P., Missaoui, R., & Lebrun, P. (2002). A partition-based approach towards constructing galois (concept) lattices. Discrete Mathematics, 256(3), 801–829.


  • Wang, Q., & Fleury, E. (2013). Overlapping community structure and modular overlaps in complex networks. Mining Social Networks and Security Informatics (pp. 15–40). Berlin: Springer.


  • Wille, R. (1995). The basic theorem of triadic concept analysis. Order, 12(2), 149–158.


  • Wille, R. (1996). Conceptual structures of multicontexts. In International Conference on Conceptual Structures (pp. 23–39). Springer.


  • Xie, J., Kelley, S., & Szymanski, B. K. (2013). Overlapping community detection in networks: The state-of-the-art and comparative study. ACM Computing Surveys (CSUR), 45(4), 43.


  • Xu, Z., Ke, Y., Wang, Y., Cheng, H., & Cheng, J. (2012). A model-based approach to attributed graph clustering. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (pp. 505–516). ACM.


  • Zeng, Z., Wang, J., Zhou, L., & Karypis, G. (2006). Coherent closed quasi-clique discovery from large dense graph databases. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 797–802). ACM.


  • Zhang, S., Wang, R.-S., & Zhang, X. (2007). Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A: Statistical Mechanics and its Applications, 374, 483–490.


  • Zhou, K., Martin, A., & Pan, Q. (2015). Evidential communities for complex networks. arXiv:1501.01780.

  • Zhou, Y., Cheng, H., & Yu, J. X. (2009). Graph clustering based on structural/attribute similarities. Proceedings of the VLDB Endowment, 2(1), 718–729.



Acknowledgements

The first author acknowledges the financial support of the Natural Sciences and Engineering Research Council of Canada (NSERC). All the authors are grateful to the reviewers for their relevant comments and suggestions.


Corresponding author

Correspondence to Rokia Missaoui.


Appendix


The Omega index (Collins and Dent 1988) compares two community structures by considering the node pairs that share no community as well as those that appear together in exactly one community, two communities, and so on (Chakraborty et al. 2017).

Let \(R=\{R_1, R_2, \ldots , R_J\}\) be the set of the J ground-truth communities in a graph with N nodes, and \(C=\{C_1, C_2, \ldots , C_K\}\) the set of detected communities. The Omega index is then defined as follows:

$$\begin{aligned} Omega \left( C,R\right) = \frac{Omega_u \left( C,R\right) -Omega_e \left( C,R\right) }{1-Omega_e \left( C,R\right) } \end{aligned}$$
(7)

where the unadjusted Omega index \(Omega_u\) is defined as

$$\begin{aligned} Omega_u \left( C,R\right) = \frac{1}{M} \sum _{j=0}^{\max (|C|,|R|)} |t_j(C)\cap t_j(R)| \end{aligned}$$
(8)

where \(M = N(N - 1)/2\) is the number of node pairs, and \(t_j(R)\) is the set of node pairs that appear together in exactly j communities of the ground-truth set R (likewise \(t_j(C)\) for the detected set C), with \(j = 0\) covering the pairs that share no community. Finally, the expected Omega index \(Omega_e\) is given by

$$\begin{aligned} Omega_e \left( C, R\right) = \frac{1}{M^{2}} \sum _{j=0}^{\max (|C|,|R|)} |t_j(C)|\cdot |t_j(R)| \end{aligned}$$
(9)
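The Omega index computation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the chapter: the names `pair_counts` and `omega_index` are hypothetical, nodes are assumed to be mutually comparable (so pairs can be stored in sorted order), and the sketch counts pairs that share no community (j = 0) as agreements too, which is what makes two identical covers score 1. Returning 1.0 when the expected index equals 1 is a convention chosen here to avoid a division by zero.

```python
from collections import Counter
from itertools import combinations


def pair_counts(communities):
    """Map each unordered node pair to the number of communities containing both nodes."""
    counts = Counter()
    for community in communities:
        for pair in combinations(sorted(community), 2):
            counts[pair] += 1
    return counts


def omega_index(detected, truth, nodes):
    """Omega index between a detected cover and a ground-truth cover."""
    nodes = sorted(nodes)
    m = len(nodes) * (len(nodes) - 1) // 2  # M = N(N - 1)/2 node pairs
    c_counts = pair_counts(detected)
    r_counts = pair_counts(truth)

    agree = 0                      # sum over j of |t_j(C) ∩ t_j(R)|
    t_c, t_r = Counter(), Counter()  # |t_j(C)| and |t_j(R)| for each j
    for pair in combinations(nodes, 2):
        jc = c_counts.get(pair, 0)
        jr = r_counts.get(pair, 0)
        t_c[jc] += 1
        t_r[jr] += 1
        if jc == jr:
            agree += 1

    omega_u = agree / m
    omega_e = sum(t_c[j] * t_r[j] for j in t_c) / m**2
    if omega_e == 1.0:
        return 1.0  # assumed convention for the degenerate case
    return (omega_u - omega_e) / (1 - omega_e)


# Hypothetical toy covers over five nodes.
truth = [{1, 2, 3}, {3, 4, 5}]
print(omega_index(truth, truth, range(1, 6)))
print(omega_index([{1, 2, 3, 4, 5}], truth, range(1, 6)))
```

The first call compares a cover with itself; the second compares a single all-inclusive community with the two-community ground truth, where the observed agreement does not exceed the expected one.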

The overlapping Normalized Mutual Information (ONMI) is computed as follows. For each node i, its membership in the detected community structure C can be encoded as a binary vector \(x_i\) of length |C|, where \((x_i)_k\) is set to 1 if node i is a member of the k-th cluster \(C_k\), and 0 otherwise. The k-th entry of this vector can be viewed as a random variable \(X_k\) whose probability distribution is given by:

\(P(X_k = 1) = N_k/N\), \(P(X_k = 0)= 1 - P(X_k =1)\), where \(N_k= |C_k|\), and N is the number of nodes in the graph. The same holds for the random variable \(Y_l\) associated with the \(l\)-th cluster in the ground-truth community structure R.

To define the entropy measures \(H(X_k)\) and \(H(X_k, Y_l)\), both the empirical marginal probability distribution \(P(X_k)\) and the joint probability distribution \(P(X_k, Y_l)\) are needed. The conditional entropy of \(X_k\) given \(Y_l\) is defined as \(H(X_k|Y_l) = H(X_k, Y_l) - H(Y_l)\). The entropy of \(X_k\) with respect to the entire vector Y is based on the best matching between \(X_k\) and any component of Y:

$$\begin{aligned} H (X_k|Y) = \min _{l=1,\ldots ,|R|} H (X_k|Y_l) \end{aligned}$$
(10)

The normalized conditional entropy of the community structure X with respect to Y is

$$\begin{aligned} H(X|Y) = \frac{1}{|C|} \sum _{k=1}^{|C|} \frac{H(X_k|Y)}{H(X_k)} \end{aligned}$$
(11)

Similarly, we define H(Y|X).

Then, the final Overlapping Normalized Mutual Information formula for two community structures C and R is given by:

$$\begin{aligned} ONMI(X|Y) = 1 - [H(X|Y) + H(Y|X)]/2 \end{aligned}$$
(12)
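The ONMI computation can be sketched in Python as follows. This is a minimal illustration that follows only the equations as stated in this appendix, with each normalized conditional entropy averaged over the clusters of the structure being conditioned. The function names are hypothetical, entropies are measured in bits, and the sketch assumes every community is a non-empty proper subset of the nodes (it raises an error otherwise rather than picking a convention for a zero entropy). Published formulations of this measure sometimes add a further constraint on the best-match step, which is not modelled here.

```python
from math import log2


def entropy(probabilities):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


def conditional_entropy(a, b, n):
    """H(X_a | Y_b) = H(X_a, Y_b) - H(Y_b) for the membership indicators of sets a and b."""
    n11 = len(a & b)            # nodes in both sets
    n10 = len(a) - n11          # nodes only in a
    n01 = len(b) - n11          # nodes only in b
    n00 = n - n11 - n10 - n01   # nodes in neither
    joint = entropy(c / n for c in (n11, n10, n01, n00))
    marginal_b = entropy((len(b) / n, 1 - len(b) / n))
    return joint - marginal_b


def normalized_conditional_entropy(xs, ys, n):
    """Average over clusters x of min_y H(X_x | Y_y) / H(X_x)."""
    total = 0.0
    for x in xs:
        h_x = entropy((len(x) / n, 1 - len(x) / n))
        if h_x == 0:
            raise ValueError("each community must be a non-empty proper subset of the nodes")
        best = min(conditional_entropy(x, y, n) for y in ys)
        total += best / h_x
    return total / len(xs)


def onmi(detected, truth, n):
    """Overlapping NMI between two covers of a graph with n nodes."""
    h_xy = normalized_conditional_entropy(detected, truth, n)
    h_yx = normalized_conditional_entropy(truth, detected, n)
    return 1 - (h_xy + h_yx) / 2


# Hypothetical toy covers over five nodes.
truth = [{1, 2, 3}, {3, 4, 5}]
print(onmi(truth, truth, 5))
print(onmi([{1, 2}, {3, 4, 5}], truth, 5))
```

In the second call the cluster {1, 2} is the exact complement of the ground-truth community {3, 4, 5}, so its conditional entropy against that community is zero even though the two sets share no node. This shows why the unconstrained best-match step should be read with some care.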


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Missaoui, R., Messaoudi, A., Ibrahim, M.H., Abdessalem, T. (2022). Detecting Communities in Complex Networks Using Formal Concept Analysis. In: Jaziri, R., Martin, A., Rousset, MC., Boudjeloud-Assala, L., Guillet, F. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 1004. Springer, Cham. https://doi.org/10.1007/978-3-030-90287-2_5
