Abstract
We study chain-referral methods for sampling in social networks. These methods rely on study subjects recruiting other participants from among their connections. This makes sampling possible when other methods, which require knowledge of the whole network or of its global characteristics, fail. Chain-referral methods can be implemented with random walks or, in the case of online social networks, with crawling. However, estimates computed from the collected samples can have high variance, especially when the sample size is small. Another drawback is the potential bias introduced by the way the samples are collected. We propose and analyze a subsampling technique in which some users are asked only to recruit other users and do not themselves participate in the study. Assuming that referral costs less than actual participation, this technique explores a larger variety of the population and thereby significantly decreases the variance of the estimator. We test the method on real social networks and on synthetic ones. As a by-product, we propose a Gibbs-like method for generating synthetic networks with desired properties.
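The subsampling idea can be illustrated with a minimal simulation sketch. Everything specific here is an assumption made for illustration and is not taken from the chapter: the toy ring-lattice graph, the block-structured node values, the degree-reweighted (Volz-Heckathorn style) estimator, and all function and parameter names. With `k = 1` every visited node participates; with `k > 1` only every `k`-th node does, and the nodes in between only pass the referral on.

```python
import random
from statistics import pvariance


def ring_lattice(n, half_width):
    """Toy undirected graph: node i is linked to its half_width nearest
    neighbours on each side of a ring (an illustrative assumption)."""
    return {
        i: [(i + s) % n for s in range(-half_width, half_width + 1) if s != 0]
        for i in range(n)
    }


def referral_walk(adj, start, n_participants, k, rng):
    """Random walk in which only every k-th visited node participates.

    The k - 1 nodes visited in between only recruit the next node.
    """
    node, participants = start, []
    while len(participants) < n_participants:
        for _ in range(k):
            node = rng.choice(adj[node])
        participants.append(node)
    return participants


def reweighted_mean(adj, values, sample):
    """Degree-reweighted mean: weights 1/degree offset the walk's
    tendency to visit high-degree nodes more often."""
    weights = [1.0 / len(adj[v]) for v in sample]
    return sum(w * values[v] for w, v in zip(weights, sample)) / sum(weights)


def estimator_variance(adj, values, n_participants, k, runs, seed):
    """Empirical variance of the estimator over independent walks."""
    rng = random.Random(seed)
    nodes = list(adj)
    estimates = [
        reweighted_mean(
            adj, values,
            referral_walk(adj, rng.choice(nodes), n_participants, k, rng),
        )
        for _ in range(runs)
    ]
    return pvariance(estimates)


if __name__ == "__main__":
    graph = ring_lattice(60, 2)
    # Neighbouring nodes tend to share the same value (blocks of 10 nodes).
    g = [1.0 if (i // 10) % 2 == 0 else 0.0 for i in range(60)]
    for k in (1, 10):
        var = estimator_variance(graph, g, n_participants=20, k=k, runs=300, seed=1)
        print(f"k={k}: empirical variance {var:.4f}")
```

Both settings use the same number of participants (20), so the comparison isolates the effect of spacing participants along the walk. In this toy graph neighbouring values are strongly correlated, which is the situation in which skipping nodes is expected to help.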
Notes
1. Here we ignore the effect of resampling.
2. It could be adopted to model the case where nodes are on a line and social influences are homogeneous.
3. Note that \(Y_i = g_j\) if the random walk is on node \(j\) at the \(i\)-th step.
4. Matrix \(P^*\) is always diagonalizable for a random walk (RW) on an undirected graph.
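A short argument for the last note, given as a sketch under stated assumptions: we take the matrix in question to be (or to share the reversibility of) the transition matrix \(P = D^{-1}A\) of the simple random walk, where \(A\) is the symmetric adjacency matrix and \(D\) is the diagonal matrix of node degrees, all assumed positive. The symbols \(A\), \(D\), \(S\), \(U\), and \(\Lambda\) are introduced here for illustration and are not taken from the chapter.

\[
S := D^{1/2} P D^{-1/2} = D^{1/2} D^{-1} A D^{-1/2} = D^{-1/2} A D^{-1/2}, \qquad S^{\top} = S .
\]

Since \(S\) is real and symmetric, it has an orthogonal eigendecomposition \(S = U \Lambda U^{\top}\) with real eigenvalues, and therefore

\[
P = D^{-1/2} S D^{1/2} = \left(D^{-1/2} U\right) \Lambda \left(D^{-1/2} U\right)^{-1},
\]

so \(P\) is diagonalizable with the same real eigenvalues as \(S\).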
Acknowledgements
This work was supported by CEFIPRA grant no. 5100-IT1 “Monte Carlo and Learning Schemes for Network Analytics,” Inria Nokia Bell Labs ADR “Network Science,” and Inria Brazilian-French research team Thanes.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Avrachenkov, K., Neglia, G., Tuholukova, A. (2016). Subsampling for Chain-Referral Methods. In: Wittevrongel, S., Phung-Duc, T. (eds) Analytical and Stochastic Modelling Techniques and Applications. ASMTA 2016. Lecture Notes in Computer Science, vol 9845. Springer, Cham. https://doi.org/10.1007/978-3-319-43904-4_2
DOI: https://doi.org/10.1007/978-3-319-43904-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43903-7
Online ISBN: 978-3-319-43904-4
eBook Packages: Computer Science, Computer Science (R0)