Detecting Communities in Complex Networks Using Formal Concept Analysis

Missaoui, Rokia; Messaoudi, Abir; Ibrahim, Mohamed Hamza; Abdessalem, Talel

doi:10.1007/978-3-030-90287-2_5

Part of the book series: Studies in Computational Intelligence (SCI, volume 1004)

Abstract

The complex nature of many real-world networks is motivating researchers to investigate or extend network analysis methods such as centrality computation, link prediction, and community detection. One of these complex structures is the multilayer network in which each layer contains a network. Multilayer networks frequently possess complex local structures of multimodal data and interlinked relations. Thus, efficient detection of local communities in such networks often remains a key challenge. In this paper, we propose a community detection strategy, called CoDeBi, which leverages Formal Concept Analysis (FCA) to find possibly overlapping and nested communities in multilayer networks. At the preprocessing stage, we exploit operations such as apposition, subposition and composition on formal contexts—associated with individual layers—to generate a global formal context representing the whole multilayer network. At the first step of CoDeBi, we extract the formal concepts that capture groups in the global formal context while in the second step, we filter the extracted formal concepts to keep only the ones that have a high harmonic mean of stability and separation indices. Such groups represent core communities. In the third step, we detect final communities by refining the core groups using Silhouette Analysis. Our validation study shows that CoDeBi can accurately identify communities in bipartite graphs, and hence can be exploited for community detection in multilayer networks. Another contribution of this paper is the application of the attractive features of Triadic Concept Analysis and the adaptation of our approach to the analysis of tridimensional networks represented by a tridimensional adjacency matrix.
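The concept-extraction step and the apposition operation mentioned above can be illustrated with a minimal Python sketch. This is not the CoDeBi implementation: it enumerates formal concepts naively by closing every attribute subset, which is only practical for tiny contexts. The function names (`formal_concepts`, `apposition`) and the toy context are hypothetical, and `apposition` assumes both contexts share the same objects and use distinct attribute names.

```python
from itertools import combinations


def derive_objects(context, attrs):
    """Return the objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)


def derive_attributes(context, objs, all_attrs):
    """Return the attributes shared by every object in objs."""
    shared = set(all_attrs)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def formal_concepts(context):
    """Naively enumerate all (extent, intent) pairs of a formal context.

    context maps each object to the frozenset of its attributes.
    Every attribute subset is closed, so the cost is exponential.
    """
    all_attrs = frozenset().union(*context.values())
    concepts = set()
    for size in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), size):
            extent = derive_objects(context, frozenset(subset))
            intent = derive_attributes(context, extent, all_attrs)
            concepts.add((extent, intent))
    return concepts


def apposition(left, right):
    """Place two contexts over the same objects side by side.

    Assumption: attribute names are distinct across the two contexts.
    """
    return {o: left[o] | right[o] for o in left}


# Hypothetical toy context: three objects, attributes a, b, c.
context = {
    1: frozenset("ab"),
    2: frozenset("ab"),
    3: frozenset("bc"),
}

for extent, intent in sorted(formal_concepts(context), key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```

Each printed pair is a candidate group: the extent is a set of nodes and the intent is the set of attributes (or neighbours in the other mode) they all share.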

Notes

1. This diagram is produced using the public domain software called ConExp.
2. We sometimes use simplified notations for sets. E.g. 125 stands for $\{1, 2, 5\}$ and ab for $\{a, b\}$. We also write $(X_j, X_k) \subseteq K_j \times K_k$ to mean that $X_j \subseteq K_j$ and $X_k \subseteq K_k$.
3. The equation is slightly different from the general one proposed by Kuznetsov and Makhalova (2016) because the symbol $\supseteq$ is more appropriate than $=$.
4. https://networkdata.ics.uci.edu/netdata/html/davis.html.
5. http://archive.ics.uci.edu/ml/datasets/zoo.

References

Berlingerio, M., Coscia, M., & Giannotti, F. (2011a). Finding and characterizing communities in multidimensional networks. In 2011 International Conference on Advances in Social Networks Analysis and Mining (pp. 490–494). IEEE.
Berlingerio, M., Coscia, M., Giannotti, F., Monreale, A., & Pedreschi, D. (2011b). Foundations of multidimensional network analysis. In 2011 International Conference on Advances in Social Networks Analysis and Mining (ASONAM) (pp. 485–489). IEEE.
Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.
Boccaletti, S., Bianconi, G., Criado, R., Genio, C. I. D., Gómez-Gardeñes, J., Romance, M., Sendiña-Nadal, I., Wang, Z., & Zanin, M. (2014). The structure and dynamics of multilayer networks. arXiv:1407.0742.
Boden, B., Günnemann, S., Hoffmann, H., & Seidl, T. (2012). Mining coherent subgraphs in multi-layer graphs with edge labels. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1258–1266). ACM.
Borgatti, S. P. (2009). 2-mode concepts in social network analysis. Encyclopedia of Complexity and System Science, 6, 8279–8291.
Boutemine, O., & Bouguessa, M. (2017). Mining community structures in multidimensional networks. TKDD, 11(4), 51:1–51:36.
Buzmakov, A., Kuznetsov, S. O., & Napoli, A. (2014). Scalable estimates of concept stability. In International Conference on Formal Concept Analysis (pp. 157–172). Springer.
Cerf, L., Besson, J., Robardet, C., & Boulicaut, J. (2009). Closed patterns meet n-ary relations. TKDD, 3(1), 3:1–3:36.
Chakraborty, T., Dalmia, A., Mukherjee, A., & Ganguly, N. (2017). Metrics for community analysis: A survey. ACM Computing Surveys (CSUR), 50(4), 54.
Collins, L. M., & Dent, C. W. (1988). Omega: A general formulation of the rand index of cluster recovery suitable for non-disjoint solutions. Multivariate Behavioral Research, 23(2), 231–242.
Crampes, M., & Plantié, M. (2012). Détection de communautés dans les graphes bipartis. In IC 2012 (p. 125).
Dickison, M. E., Magnani, M., & Rossi, L. (2016). Multilayer Social Networks (1st edn). New York: Cambridge University Press.
Dong, X., Frossard, P., Vandergheynst, P., & Nefedov, N. (2012). Clustering with multi-layer graphs: A spectral perspective. IEEE Transactions on Signal Processing, 60(11), 5820–5831.
Du, N., Wang, B., Wu, B., & Wang, Y. (2008). Overlapping community detection in bipartite networks. In Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology-Volume 01 (pp. 176–179). IEEE Computer Society.
Dunlavy, D. M., Kolda, T. G., & Kegelmeyer, W. P. (2011). Multilinear algebra for analyzing data with multiple linkages. In Graph Algorithms in the Language of Linear Algebra (pp. 85–114). SIAM.
Everett, M. G., & Borgatti, S. P. (2013). The dual-projection approach for two-mode networks. Social Networks, 35(2), 204–210.
Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3–5), 75–174.
Ganter, B., & Obiedkov, S. A. (2016). Conceptual Exploration. Berlin: Springer.
Ganter, B., & Wille, R. (1999). Formal Concept Analysis: Mathematical Foundations. New York: Springer. Translator-C. Franzke.
Hacene, M. R., Huchard, M., Napoli, A., & Valtchev, P. (2013). Relational concept analysis: Mining concept lattices from multi-relational data. Annals of Mathematics and Artificial Intelligence, 67(1), 81–108.
Hmimida, M., & Kanawati, R. (2015). Community detection in multiplex networks: A seed-centric approach. NHM, 10(1), 71–85.
Ibrahim, M. H., & Missaoui, R. (2018). An efficient approximation of concept stability using low-discrepancy sampling. In Graph-Based Representation and Reasoning - 23rd International Conference on Conceptual Structures, ICCS 2018, Edinburgh, UK, June 20-22, 2018, Proceedings (pp. 24–38).
Interdonato, R., Atzmueller, M., Gaito, S., Kanawati, R., Largeron, C., & Sala, A. (2019). Feature-rich networks: Going beyond complex network topologies. Applied Network Science, 4(1), 4:1–4:13.
Jay, N., Kohler, F., & Napoli, A. (2008). Analysis of social communities with iceberg and stability-based concept lattices. In International Conference on Formal Concept Analysis (pp. 258–272). Springer.
Kim, J., & Lee, J.-G. (2015). Community detection in multi-layer graphs: A survey. ACM SIGMOD Record, 44(3), 37–48.
Kivelä, M., Arenas, A., Barthelemy, M., Gleeson, J. P., Moreno, Y., & Porter, M. A. (2014). Multilayer networks. Journal of Complex Networks, 2(3), 203–271.
Klimushkin, M., Obiedkov, S., & Roth, C. (2010). Approaches to the selection of relevant concepts in the case of noisy data. In International Conference on Formal Concept Analysis (pp. 255–266). Springer.
Kolda, T. G., & Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review, 51(3), 455–500.
Kuznetsov, S. O. (2007). On stability of a formal concept. Annals of Mathematics and Artificial Intelligence, 49(1), 101–115.
Kuznetsov, S. O., & Makhalova, T. (2018). On interestingness measures of formal concepts. Information Sciences, 442, 202–219.
Kuznetsov, S. O., & Makhalova, T. P. (2016). On stability of triadic concepts. In Proceedings of the Thirteenth International Conference on Concept Lattices and Their Applications, Moscow, Russia, July 18-22, 2016 (pp. 245–253).
Lancichinetti, A., Radicchi, F., Ramasco, J. J., & Fortunato, S. (2010). Finding statistically significant communities in networks. arXiv:1012.2363.
Lehmann, F., & Wille, R. (1995). A triadic approach to formal concept analysis. In ICCS (pp. 32–43).
Lehmann, S., Schwartz, M., & Hansen, L. K. (2008). Biclique communities. Physical Review E, 78(1), 016108.
Li, H., Nie, Z., Lee, W.-C., Giles, L., & Wen, J.-R. (2008). Scalable community discovery on textual data with relations. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (pp. 1203–1212). ACM.
Messaoudi, A., Missaoui, R., & Ibrahim, M. H. (2019). Detecting overlapping communities in two-mode data networks using formal concept analysis. Revue des Nouvelles Technologies de l’Information, Extraction et Gestion des connaissances, RNTI-E-35, 189–200.
Newman, M. E., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113.
Newman, M. E. J. (2003). The structure and function of complex networks. SIAM Review, 45(2), 167–256.
Nicosia, V., Mangioni, G., Carchiolo, V., & Malgeri, M. (2009). Extending the definition of modularity to directed graphs with overlapping communities. Journal of Statistical Mechanics: Theory and Experiment, 2009(03), 3–24.
Palla, G., Derényi, I., Farkas, I., & Vicsek, T. (2005). Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043), 814.
Potgieter, A., April, K. A., Cooke, R. J., & Osunmakinde, I. O. (2009). Temporality in link prediction: Understanding social complexity. Emergence: Complexity & Organization (E: CO), 11(1), 69–83.
Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4), 1118–1123.
Roth, C., Obiedkov, S., & Kourie, D. G. (2008). On succinct representation of knowledge community taxonomies with formal concept analysis. International Journal of Foundations of Computer Science, 19(02), 383–404.
Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65.
Salehi, M., Sharma, R., Marzolla, M., Magnani, M., Siyari, P., & Montesi, D. (2015). Spreading processes in multilayer networks. IEEE Transactions on Network Science and Engineering, 2(2), 65–83.
Shi, C., Li, Y., Zhang, J., Sun, Y., & Yu, P. S. (2017). A survey of heterogeneous information network analysis. IEEE Transactions on Knowledge and Data Engineering, 29(1), 17–37.
Silva, A., Meira, W., Jr., & Zaki, M. J. (2012). Mining attribute-structure correlated patterns in large attributed graphs. Proceedings of the VLDB Endowment, 5(5), 466–477.
Sun, Y., & Han, J. (2012). Mining Heterogeneous Information Networks: Principles and Methodologies. Synthesis Lecture on Data Mining and Knowledge Discovery. San Rafael: Morgan & Claypool Publishers.
Tang, L., & Liu, H. (2010). Community detection and mining in social media. Synthesis Lectures on Data Mining and Knowledge Discovery, 2(1), 1–137.
Tang, W., Lu, Z., & Dhillon, I. S. (2009). Clustering with multiple graphs. In 2009 Ninth IEEE International Conference on Data Mining (pp. 1016–1021). IEEE.
Valtchev, P., & Missaoui, R. (2001). Building concept (galois) lattices from parts: Generalizing the incremental methods. In Conceptual Structures: Broadening the Base, 9th International Conference on Conceptual Structures, ICCS 2001, Stanford, CA, USA, July 30-August 3, 2001, Proceedings (pp. 290–303).
Valtchev, P., Missaoui, R., & Lebrun, P. (2002). A partition-based approach towards constructing galois (concept) lattices. Discrete Mathematics, 256(3), 801–829.
Wang, Q., & Fleury, E. (2013). Overlapping community structure and modular overlaps in complex networks. Mining Social Networks and Security Informatics (pp. 15–40). Berlin: Springer.
Wille, R. (1995). The basic theorem of triadic concept analysis. Order, 12(2), 149–158.
Wille, R. (1996). Conceptual structures of multicontexts. In International Conference on Conceptual Structures (pp. 23–39). Springer.
Xie, J., Kelley, S., & Szymanski, B. K. (2013). Overlapping community detection in networks: The state-of-the-art and comparative study. ACM Computing Surveys (CSUR), 45(4), 43.
Xu, Z., Ke, Y., Wang, Y., Cheng, H., & Cheng, J. (2012). A model-based approach to attributed graph clustering. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (pp. 505–516). ACM.
Zeng, Z., Wang, J., Zhou, L., & Karypis, G. (2006). Coherent closed quasi-clique discovery from large dense graph databases. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 797–802). ACM.
Zhang, S., Wang, R.-S., & Zhang, X. (2007). Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A: Statistical Mechanics and its Applications, 374, 483–490.
Zhou, K., Martin, A., & Pan, Q. (2015). Evidential communities for complex networks. arXiv:1501.01780.
Zhou, Y., Cheng, H., & Yu, J. X. (2009). Graph clustering based on structural/attribute similarities. Proceedings of the VLDB Endowment, 2(1), 718–729.

Acknowledgements

The first author acknowledges the financial support of the Natural Sciences and Engineering Research Council of Canada (NSERC). All the authors are grateful to the reviewers for their relevant comments and suggestions.

Author information

Authors and Affiliations

Université du Québec en Outaouais (UQO), Gatineau, Canada
Rokia Missaoui, Abir Messaoudi & Mohamed Hamza Ibrahim
Zagazig University, Zagazig, Egypt
Mohamed Hamza Ibrahim
Télécom Paris, Paris, France
Talel Abdessalem

Corresponding author

Correspondence to Rokia Missaoui.

Editor information

Editors and Affiliations

LIASD, Paris 8 University, Saint-Denis, France
Rakia Jaziri
IRISA, University of Rennes, Lannion, France
Arnaud Martin
LIG (CNRS UMR 5217), Université Grenoble Alpes, Grenoble, France
Marie-Christine Rousset
LORIA, Université de Lorraine, Metz, France
Lydia Boudjeloud-Assala
LS2N (CNRS UMR 6004), Nantes University, Nantes, France
Fabrice Guillet

Appendix

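The Omega index (Collins and Dent 1988) is based on the number of communities shared by each pair of nodes: it counts the node pairs that share no community, exactly one community, exactly two communities, and so on, and measures how often the two community structures agree on these counts (Chakraborty et al. 2017).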

Let $R=\{R_1, R_2, \ldots , R_J\}$ be the set of the J ground-truth communities in the graph of size N, and $C=\{C_1, C_2, \ldots , C_K\}$ the set of detected communities. The Omega index is then defined as follows:

$$\begin{aligned} Omega \left( C,R\right) = \frac{Omega_u \left( C,R\right) -Omega_e \left( C,R\right) }{1-Omega_e \left( C,R\right) } \end{aligned}$$

(7)

where the unadjusted omega index Omega$_u$ is defined as

$$\begin{aligned} Omega_u \left( C,R\right) = \frac{1}{M} \sum _{j=0}^{\max (|C|,|R|)} |t_j(C)\cap t_j(R)| \end{aligned}$$

(8)

where $M = N(N - 1)/2$ is the number of node pairs, and $t_j(R)$ is the set of node pairs that appear together in exactly $j$ communities of the ground-truth set $R$ ($t_j(C)$ is defined likewise for $C$). Finally, the expected omega index Omega$_e$ is given by

$$\begin{aligned} Omega_e \left( C, R\right) = \frac{1}{M^{2}} \sum _{j=0}^{\max (|C|,|R|)} |t_j(C)|\cdot |t_j(R)| \end{aligned}$$

(9)
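Equations (7) to (9) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code. It assumes that the sum over $j$ includes $j = 0$ (pairs that share no community), in line with the description of the index at the start of this appendix, and that communities are given as Python sets over comparable node labels. The function names `pair_counts` and `omega_index` are hypothetical, as is the convention of returning 1.0 when the denominator of Eq. (7) vanishes.

```python
from collections import Counter
from itertools import combinations


def pair_counts(cover):
    """Map each unordered node pair to the number of communities containing both."""
    counts = Counter()
    for community in cover:
        for pair in combinations(sorted(community), 2):
            counts[pair] += 1
    return counts


def omega_index(detected, truth, nodes):
    """Omega index between two (possibly overlapping) community structures."""
    nodes = sorted(nodes)
    n = len(nodes)
    m = n * (n - 1) // 2  # M = N(N - 1)/2 node pairs
    c_counts = pair_counts(detected)
    r_counts = pair_counts(truth)

    agree = 0          # sum over j of |t_j(C) ∩ t_j(R)|
    c_hist = Counter() # |t_j(C)| for each j
    r_hist = Counter() # |t_j(R)| for each j
    for pair in combinations(nodes, 2):
        jc = c_counts.get(pair, 0)
        jr = r_counts.get(pair, 0)
        c_hist[jc] += 1
        r_hist[jr] += 1
        if jc == jr:
            agree += 1

    omega_u = agree / m
    omega_e = sum(c_hist[j] * r_hist[j] for j in c_hist) / m**2
    if omega_e == 1:
        # Assumption: both structures put every pair at the same level,
        # so Eq. (7) is 0/0; we report perfect agreement.
        return 1.0
    return (omega_u - omega_e) / (1 - omega_e)


nodes = [1, 2, 3, 4]
truth = [{1, 2}, {3, 4}]
print(omega_index([{1, 2}, {3, 4}], truth, nodes))
print(omega_index([{1, 2, 3, 4}], truth, nodes))
```

The first call compares the ground truth with itself; the second compares it with a single community containing all nodes, where the observed agreement equals the agreement expected by chance.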

The overlapping Normalized Mutual Information is computed as follows. The community membership of each node $i$ in the detected community structure $C$ can be encoded as a binary vector $x_i$ of length $|C|$, where $(x_i)_k$ is set to 1 if node $i$ is a member of the $k$-th cluster $C_k$, and 0 otherwise. The $k$-th entry of this vector can be viewed as a random variable $X_k$ whose probability distribution is given by:

$P(X_k = 1) = N_k/N$ and $P(X_k = 0) = 1 - P(X_k = 1)$, where $N_k = |C_k|$ is the number of nodes in $C_k$ and $N$ is the number of nodes in the graph. The same holds for the random variable $Y_l$ associated with the $l$-th cluster in the ground-truth community structure $R$.

To define the entropy measures $H(X_k)$ and $H(X_k, Y_l)$, both the empirical marginal probability distribution $P(X_k)$ and the joint probability distribution $P(X_k, Y_l)$ are needed. The conditional entropy of a cluster $X_k$ given $Y_l$ is defined as $H(X_k|Y_l) = H(X_k, Y_l) - H(Y_l)$. The entropy of $X_k$ with respect to the entire vector $Y$ is based on the best matching between $X_k$ and any component of $Y$:

$$\begin{aligned} H(X_k|Y) = \min _{l=1,\ldots ,|R|} H(X_k|Y_l) \end{aligned}$$

(10)

The normalized conditional entropy of the community structure $X$ with respect to $Y$ is

$$\begin{aligned} H(X|Y) = \frac{1}{|C|} \sum _{k=1}^{|C|} \frac{H(X_k|Y)}{H(X_k)} \end{aligned}$$

(11)

Similarly, we define H(Y|X).

Then, the final Overlapping Normalized Mutual Information formula for two community structures $C$ and $R$ is given by:

$$\begin{aligned} ONMI(X|Y) = 1 - [H(X|Y) + H(Y|X)]/2 \end{aligned}$$

(12)
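The computation in Equations (10) to (12) can be sketched in Python as follows. This is a minimal sketch following the definitions as stated here, without any further constraint on the matching in Eq. (10); it is not the authors' implementation. It assumes that $P(X_k = 1) = |C_k|/N$, that the normalized conditional entropy is averaged over the communities of its first argument, that entropies are measured in bits, and that a community with zero entropy (empty or covering all nodes) contributes 0 to the sum. The function names are hypothetical.

```python
from math import log2


def h(p):
    """Term -p * log2(p), with the convention 0 * log(0) = 0."""
    return -p * log2(p) if p > 0 else 0.0


def entropy(community, n):
    """H(X_k) of the binary membership variable of one community."""
    size = len(community)
    return h(size / n) + h((n - size) / n)


def joint_entropy(a, b, n):
    """H(X_k, Y_l) from the four membership combinations of two communities."""
    both = len(a & b)
    only_a = len(a) - both
    only_b = len(b) - both
    neither = n - both - only_a - only_b
    return sum(h(count / n) for count in (both, only_a, only_b, neither))


def normalized_conditional_entropy(xs, ys, n):
    """H(X|Y): average over k of min_l H(X_k|Y_l) / H(X_k)."""
    total = 0.0
    for x in xs:
        hx = entropy(x, n)
        if hx == 0:
            continue  # assumption: degenerate community contributes 0
        best = min(joint_entropy(x, y, n) - entropy(y, n) for y in ys)
        total += best / hx
    return total / len(xs)


def onmi(detected, truth, nodes):
    """Overlapping NMI between two community structures (lists of sets)."""
    n = len(nodes)
    h_xy = normalized_conditional_entropy(detected, truth, n)
    h_yx = normalized_conditional_entropy(truth, detected, n)
    return 1 - (h_xy + h_yx) / 2


nodes = [1, 2, 3, 4]
print(onmi([{1, 2}, {3, 4}], [{1, 2}, {3, 4}], nodes))
print(onmi([{1, 2}], [{1, 3}], nodes))
```

In the second call, knowing membership in $\{1, 3\}$ gives no information about membership in $\{1, 2\}$ on four nodes, so both normalized conditional entropies equal 1.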

Cite this chapter

Missaoui, R., Messaoudi, A., Ibrahim, M.H., Abdessalem, T. (2022). Detecting Communities in Complex Networks Using Formal Concept Analysis. In: Jaziri, R., Martin, A., Rousset, MC., Boudjeloud-Assala, L., Guillet, F. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 1004. Springer, Cham. https://doi.org/10.1007/978-3-030-90287-2_5

DOI: https://doi.org/10.1007/978-3-030-90287-2_5
Published: 15 March 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90286-5
Online ISBN: 978-3-030-90287-2
eBook Packages: Intelligent Technologies and Robotics (R0)
