Community detection in social networks by spectral embedding of typed graphs

  • Original Article
Social Network Analysis and Mining

Abstract

Although there is considerable disagreement about the details, community detection in social networks requires finding groups of nodes that are similar to one another and different from other groups. The notion of similarity is therefore key. Some techniques use attribute similarity, in which two nodes are similar when they share similar attribute values; others use structural similarity, in which two nodes are similar when they are well connected, directly or indirectly. Recent work has tried to use both attribute and structural similarity, but the obvious challenge is how to merge and weight these two qualitatively different types of similarity. We design a community detection technique that not only uses attributes and structure, but also separates qualitatively different kinds of attributes and treats similarity differently for each. Attributes and structure are then combined into a single graph in a principled way, and a spectral embedding is used to place the nodes in a geometry where conventional clustering algorithms can be applied. We apply our community detection technique to real-world data from the Instagram social network, which we crawl to extract the data of a large set of users. We compute attribute similarity from users’ post content, hashtags, image content, and followership, treating these as qualitatively different modes of similarity. Our technique outperforms a range of popular community detection techniques across many metrics, providing evidence that different attribute modalities are important for discovering communities. We also validate our technique by computing the topics associated with each community and showing that these are plausibly coherent. This highlights a potential application of community detection in social networks: finding groups of users with specific interests who could be the targets of focused marketing.
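
The pipeline outlined in the abstract (one similarity graph per mode, a single combined graph, a spectral embedding, then conventional clustering) can be illustrated with a minimal sketch. The abstract does not specify how the typed graphs are combined, so the weighted sum used here is only an assumption for illustration and should not be read as the authors' construction. The function names, the weights, and the six-node toy data are likewise hypothetical.

```python
import numpy as np

def combine_typed_graphs(layers, weights):
    """Merge per-type similarity matrices by a weighted sum.
    ASSUMPTION: the weighted sum is a stand-in; the abstract does not
    describe the actual combination rule."""
    combined = sum(w * np.asarray(a, dtype=float) for a, w in zip(layers, weights))
    np.fill_diagonal(combined, 0.0)  # ignore self-similarity
    return combined

def spectral_embedding(adjacency, dims):
    """Embed nodes using eigenvectors of the symmetric normalized Laplacian
    L = I - D^{-1/2} A D^{-1/2}."""
    degrees = adjacency.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degrees, 1e-12))
    laplacian = np.eye(len(adjacency)) - inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]
    _, eigenvectors = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return eigenvectors[:, 1:dims + 1]           # skip the trivial first eigenvector

def kmeans(points, k, iterations=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iterations):
        sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq_dist, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data (invented): users 0-2 and 3-5 form two groups, bridged by edge 2-3.
follows = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    follows[i, j] = follows[j, i] = 1.0

# Hashtag similarity: high within a group, low across groups.
group = np.array([0, 0, 0, 1, 1, 1])
hashtags = np.where(group[:, None] == group[None, :], 0.9, 0.1)

combined = combine_typed_graphs([follows, hashtags], weights=[1.0, 1.0])
embedding = spectral_embedding(combined, dims=1)
labels = kmeans(embedding, k=2)
print(labels)
```

In this sketch the relative weights decide how strongly each mode influences the embedding. Choosing them is exactly the merge-and-weight challenge the abstract raises, and the equal weights above are arbitrary.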

Author information

Contributions

MA carried out the research and contributed to the writing. DBS wrote the main manuscript text. Both authors reviewed the manuscript.

Corresponding author

Correspondence to D. B. Skillicorn.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Alfaqeeh, M., Skillicorn, D.B. Community detection in social networks by spectral embedding of typed graphs. Soc. Netw. Anal. Min. 14, 12 (2024). https://doi.org/10.1007/s13278-023-01172-y

  • DOI: https://doi.org/10.1007/s13278-023-01172-y
