Abstract
In recent years, social media has become a ubiquitous and integral part of social discourse. Homophily is a fundamental topic in network science and can provide insights into the flow of information and behaviours within society. Homophily refers mainly to the tendency of like-minded people to interact with one another in social groups more than with dissimilar-minded people. The study of homophily has been very useful in analyzing the formation of online communities. In this paper, we survey the effects of homophily in social networks and summarize the state-of-the-art methods proposed in recent years to identify and measure those effects in multiple types of social networks. We conclude with a critical discussion of open challenges and directions for future research.
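As a rough illustration of what measuring homophily can mean in practice, the sketch below compares the observed share of same-group ties in a network with the share expected if ties were formed between randomly chosen pairs of nodes. This is one simple measure chosen for illustration and is not presented as a method from the survey; the toy network, the group labels, and the function names `edge_homophily` and `baseline_homophily` are all hypothetical.

```python
from collections import Counter


def edge_homophily(edges, group):
    """Fraction of edges whose two endpoints share the same group label."""
    same = sum(1 for u, v in edges if group[u] == group[v])
    return same / len(edges)


def baseline_homophily(group):
    """Expected same-group fraction if every pair of distinct nodes
    were equally likely to be linked (an assumed null model)."""
    n = len(group)
    counts = Counter(group.values())
    same_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    total_pairs = n * (n - 1) // 2
    return same_pairs / total_pairs


# Hypothetical toy network: six users in two groups, "x" and "y".
group = {"a": "x", "b": "x", "c": "x", "d": "y", "e": "y", "f": "y"}
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("c", "d")]

observed = edge_homophily(edges, group)
expected = baseline_homophily(group)

print(f"observed same-group share: {observed:.3f}")
print(f"expected under random mixing: {expected:.3f}")
```

An observed share well above the baseline suggests that ties concentrate within groups. The uniform-pairing baseline ignores degree differences between nodes, so measures that account for degree, such as an assortativity coefficient, may be more appropriate for real data.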
Acknowledgments
We would like to thank the reviewers for their helpful comments on our work. This work is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Khanam, K.Z., Srivastava, G. & Mago, V. The homophily principle in social network analysis: A survey. Multimed Tools Appl 82, 8811–8854 (2023). https://doi.org/10.1007/s11042-021-11857-1
DOI: https://doi.org/10.1007/s11042-021-11857-1