Leader identification in an online health community for cancer survivors: a social network-based classification approach

  • Original Article
Information Systems and e-Business Management

Abstract

Online health communities (OHCs) are an important source of social support for cancer survivors and their informal caregivers. This study sought to identify leaders in a popular online forum for cancer survivors and caregivers using classification techniques. We first extracted user features from several perspectives, including contributions, network centralities, and linguistic features. Building on these features, we leveraged the structure of the social network among users to generate new neighborhood-based and cluster-based features. Classification results showed that these features are discriminative for leader identification. Using them, we developed a hybrid approach based on an ensemble classifier that outperforms many traditional metrics. This research has implications for understanding and managing OHCs.
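To make the idea of a neighborhood-based feature concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a neighborhood-based feature is the mean of a user's direct neighbors' values for each base feature; the abstract does not specify the aggregation, and the names `neighborhood_features`, `adjacency`, and `features`, as well as the toy data, are hypothetical.

```python
from statistics import mean


def neighborhood_features(adjacency, features):
    """Average each base feature over a user's direct neighbors.

    adjacency: dict mapping user -> list of neighboring users
    features:  dict mapping user -> dict of base feature values
    Assumption: the mean is used as the aggregate; users without
    neighbors receive 0.0 for every neighborhood feature.
    """
    result = {}
    for user, neighbors in adjacency.items():
        names = features[user].keys()
        if neighbors:
            result[user] = {
                f"nbr_mean_{name}": mean(features[v][name] for v in neighbors)
                for name in names
            }
        else:
            result[user] = {f"nbr_mean_{name}": 0.0 for name in names}
    return result


# Hypothetical toy network: user "a" replied to "b" and "c"; "d" is isolated.
adjacency = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}
features = {
    "a": {"posts": 10, "degree": 2},
    "b": {"posts": 4, "degree": 1},
    "c": {"posts": 2, "degree": 1},
    "d": {"posts": 1, "degree": 0},
}
nbr = neighborhood_features(adjacency, features)
```

The resulting values could be appended to each user's own feature vector before training a classifier, so that a user's context in the network also informs the prediction.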

Notes

  1. As one-class SVM only provides binary decisions, we specify the “outlier” ratio in the classification process, so that the classifier identifies K influential users.

  2. For esthetic purposes, the threshold of 15 was chosen to better illustrate the relationship between discussion boards. A lower threshold will increase the number of edges in the figure and a higher threshold will make the network sparser.
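The two notes above can be illustrated with a short sketch under stated assumptions. For Note 1, it assumes each user has a continuous decision score (for example, from a one-class classifier) where lower means more outlying, and it flags exactly K users, which corresponds to an outlier ratio of K/N. For Note 2, it assumes edges between discussion boards carry numeric weights and that an edge is drawn when its weight is at least the threshold. The function names, scores, and weights are hypothetical and do not come from the article.

```python
def flag_top_k(scores, k):
    """Return the k users with the lowest decision scores.

    Assumption: a lower score means "more outlying", so the k lowest
    scores are the users treated as influential (ratio k / len(scores)).
    """
    ranked = sorted(scores, key=scores.get)
    return set(ranked[:k])


def filter_edges(weights, threshold):
    """Keep only edges whose weight is at least the threshold.

    Assumption: the comparison is inclusive; the note does not say.
    """
    return {pair: w for pair, w in weights.items() if w >= threshold}


# Hypothetical decision scores for five users; K = 2 gives a ratio of 0.4.
scores = {"u1": 0.8, "u2": -1.2, "u3": 0.1, "u4": -0.4, "u5": 0.5}
k = 2
outlier_ratio = k / len(scores)
leaders = flag_top_k(scores, k)

# Hypothetical board-to-board edge weights.
weights = {("A", "B"): 40, ("A", "C"): 15, ("B", "C"): 9, ("C", "D"): 22}
edge_counts = {t: len(filter_edges(weights, t)) for t in (5, 15, 30)}
```

In this toy data the number of retained edges falls as the threshold rises, which mirrors the trade-off described in Note 2: a lower threshold produces a denser figure and a higher threshold a sparser one.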

Author information

Corresponding author

Correspondence to Kang Zhao.

About this article

Cite this article

Zhao, K., Greer, G.E., Yen, J. et al. Leader identification in an online health community for cancer survivors: a social network-based classification approach. Inf Syst E-Bus Manage 13, 629–645 (2015). https://doi.org/10.1007/s10257-014-0260-5

  • DOI: https://doi.org/10.1007/s10257-014-0260-5
