Extracting representative user subset of social networks towards user characteristics and topological features

Zhou, Yiming; Han, Yuehui; Liu, An; Li, Zhixu; Yin, Hongzhi; Chen, Wei; Zhao, Lei

doi:10.1007/s11280-020-00828-5

Published: 30 June 2020

World Wide Web, Volume 23, pages 2903–2931 (2020)

Yiming Zhou¹, Yuehui Han¹, An Liu¹, Zhixu Li¹, Hongzhi Yin², Wei Chen¹ & Lei Zhao¹ (ORCID: orcid.org/0000-0002-5123-9279)

Abstract

Extracting a representative subset of users from a social network plays a critical role in social network analysis. Existing studies either preserve users' characteristics when sampling representative users or preserve the topological structure of the network. However, both user characteristics and network topology carry abundant information about users, so it is important to preserve both when extracting the representative user subset. To this end, we formulate the task as the Representative User Subset (RUS) problem and prove that it is NP-hard. To solve the RUS problem, we propose two approaches, KS (K-Selected) and an optimized method, ACS, each consisting of a clustering algorithm and a sampling model; the sampling model is solved with a greedy heuristic algorithm. In addition, we propose a pruning strategy that takes advantage of a max-heap structure. To validate the performance of the proposed approaches, extensive experiments are conducted on two real-world datasets. The results demonstrate that our methods outperform state-of-the-art approaches.
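The abstract does not state the sampling objective, so the following is only a minimal sketch of how a greedy heuristic can be pruned with a max-heap; it is not the authors' KS or ACS method. It assumes a facility-location style objective (each user is credited with its similarity to the closest selected representative) over a hypothetical non-negative similarity matrix `SIM`. The function name `lazy_greedy_select` is also an illustrative choice. Because this assumed objective is submodular, a stale gain stored in the heap is an upper bound on the true gain, so most candidates never need to be re-evaluated.

```python
import heapq
from typing import List, Sequence


def lazy_greedy_select(sim: Sequence[Sequence[float]], k: int) -> List[int]:
    """Greedily pick k representatives maximizing sum_u max_{s in S} sim[u][s].

    Assumes sim[u][c] >= 0 is the similarity of user u to candidate c.
    """
    n = len(sim)
    best = [0.0] * n  # best[u]: similarity of u to its closest selected user

    def gain(c: int) -> float:
        # Marginal improvement of the objective if candidate c were added.
        return sum(max(sim[u][c] - best[u], 0.0) for u in range(n))

    # heapq is a min-heap, so gains are negated to emulate a max-heap.
    # Each entry: (-gain, candidate, number of selections when gain was computed).
    heap = [(-gain(c), c, 0) for c in range(n)]
    heapq.heapify(heap)

    selected: List[int] = []
    while heap and len(selected) < k:
        _, c, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # The gain is up to date and tops the heap, so it is the true maximum.
            selected.append(c)
            for u in range(n):
                best[u] = max(best[u], sim[u][c])
        else:
            # Stale entry: recompute its gain and push it back (the pruning step).
            heapq.heappush(heap, (-gain(c), c, len(selected)))
    return selected


# Hypothetical toy data: users 0-2 form one group, users 3-5 another.
SIM = [
    [1.0, 0.8, 0.7, 0.1, 0.1, 0.1],
    [0.8, 1.0, 0.9, 0.1, 0.1, 0.1],
    [0.7, 0.9, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.7, 0.3],
    [0.1, 0.1, 0.1, 0.7, 1.0, 0.7],
    [0.1, 0.1, 0.1, 0.3, 0.7, 1.0],
]

print(lazy_greedy_select(SIM, 2))  # -> [1, 4]
```

On this toy input the sketch picks the most central user of each group. In a pipeline like the one the abstract outlines, such a selection step would presumably follow the clustering stage, but how the two stages are combined is not specified here.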

Notes

d₁: the number of user’s followers, d₂: the number of user’s friends, d₃: the score of user’s influence, d₄: the score of user’s activity, d₅: the number of tweets, d₆: times of tweets being “liked”, d₇: times of tweets being “retweeted”, d₈: address, d₉: words and phrases
http://scikit-learn.org/stable/modules/naive_bayes.html#multinomial-naive-bayes
http://scikit-learn.org/stable/modules/ensemble.html#random-forests
http://networkx.github.io/

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61572335, 61572336, 61902270); the Major Program of Natural Science Foundation, Educational Commission of Jiangsu Province, China (Grant No. 19KJA610002); the Natural Science Foundation, Educational Commission of Jiangsu Province, China (Grant Nos. 19KJB520052, 19KJB520050); and the Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu, China.

Author information

Authors and Affiliations

School of Computer Science and Technology, Soochow University, Suzhou, China
Yiming Zhou, Yuehui Han, An Liu, Zhixu Li, Wei Chen & Lei Zhao
School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia
Hongzhi Yin

Corresponding authors

Correspondence to Wei Chen or Lei Zhao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Web Information Systems Engineering 2018

Guest Editors: Hakim Hacid, Wojciech Cellary, Hua Wang and Yanchun Zhang

About this article

Cite this article

Zhou, Y., Han, Y., Liu, A. et al. Extracting representative user subset of social networks towards user characteristics and topological features. World Wide Web 23, 2903–2931 (2020). https://doi.org/10.1007/s11280-020-00828-5

Received: 22 March 2019
Revised: 28 May 2020
Accepted: 01 June 2020
Published: 30 June 2020
Issue Date: September 2020
DOI: https://doi.org/10.1007/s11280-020-00828-5
