Extracting Representative User Subset of Social Networks Towards User Characteristics and Topological Features

Zhou, Yiming; Han, Yuehui; Liu, An; Li, Zhixu; Yin, Hongzhi; Zhao, Lei

doi:10.1007/978-3-030-02922-7_15

Yiming Zhou, Yuehui Han, An Liu, Zhixu Li, Hongzhi Yin & Lei Zhao

Conference paper
First Online: 20 October 2018

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11233)

Abstract

Extracting a subset of representative users from the original user set of a social network plays a critical role in social network analysis. Some existing studies focus on preserving users’ characteristics when sampling representative users, while others focus on preserving the topological structure. However, both users’ characteristics and the network topology carry abundant information about users, so it is important to preserve both when extracting the representative user subset. To this end, we propose a novel approach and formulate the problem as the RUS (Representative User Subset) problem, which is proved to be NP-hard. To solve the RUS problem, we propose a method, KS (K-Selected), that consists of a clustering algorithm and a sampling model, where a greedy heuristic algorithm is proposed to solve the sampling model. To validate the performance of the proposed approach, extensive experiments are conducted on two real-world datasets. The results demonstrate that our method outperforms state-of-the-art approaches.
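The abstract does not give the details of KS, so the following is only a minimal sketch of the general "cluster, then greedily sample" pattern it describes, not the authors' algorithm. Everything in it is an assumption made for illustration: connected components stand in for the clustering step, a facility-location style coverage score stands in for the sampling model, and the names `connected_components`, `similarity`, and `greedy_select` are hypothetical.

```python
from math import dist


def connected_components(n, edges):
    """Stand-in clustering step: label users 0..n-1 by connected component."""
    adj = {u: set() for u in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    label = [-1] * n
    current = 0
    for start in range(n):
        if label[start] != -1:
            continue
        label[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if label[v] == -1:
                    label[v] = current
                    stack.append(v)
        current += 1
    return label


def similarity(i, j, features, label):
    """Hypothetical similarity: feature closeness, counted only inside a cluster."""
    if label[i] != label[j]:
        return 0.0
    return 1.0 / (1.0 + dist(features[i], features[j]))


def greedy_select(features, edges, k):
    """Greedily pick up to k users that maximize total coverage of all users."""
    n = len(features)
    label = connected_components(n, edges)
    best = [0.0] * n  # best similarity of each user to any chosen representative
    selected = []
    for _ in range(min(k, n)):
        def gain(c):
            # Marginal improvement in coverage if candidate c were added.
            return sum(
                max(similarity(u, c, features, label) - best[u], 0.0)
                for u in range(n)
            )
        candidate = max((c for c in range(n) if c not in selected), key=gain)
        selected.append(candidate)
        for u in range(n):
            best[u] = max(best[u], similarity(u, candidate, features, label))
    return selected
```

Calling `greedy_select(features, edges, k)` with one feature tuple per user and an undirected edge list returns the indices of the chosen users. The same-cluster restriction is how this sketch lets topology influence the choice, while the distance term reflects user characteristics.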

Notes

1. \(d_1\): the number of the user’s followers, \(d_2\): the number of the user’s friends, \(d_3\): the user’s influence score, \(d_4\): the user’s activity score, \(d_5\): the number of tweets, \(d_6\): the number of times the user’s tweets were “liked”, \(d_7\): the number of times the user’s tweets were “retweeted”, \(d_8\): address, \(d_9\): words and phrases.
2. http://scikit-learn.org/stable/modules/naive_bayes.html#multinomial-naive-bayes.
3. http://scikit-learn.org/stable/modules/ensemble.html#random-forests.

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61572335, 61572336, 61472263, 61402312 and 61402313, the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20151223, and the Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu, China.

Author information

Authors and Affiliations

School of Computer Science and Technology, Soochow University, Suzhou, China
Yiming Zhou, Yuehui Han, An Liu, Zhixu Li & Lei Zhao
School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia
Hongzhi Yin

Corresponding author

Correspondence to Lei Zhao.

Editor information

Editors and Affiliations

Zayed University, Dubai, United Arab Emirates
Hakim Hacid
Poznan University of Economics, Poznan, Poland
Wojciech Cellary
Victoria University, Footscray, VIC, Australia
Hua Wang
UNSW Australia, Sydney, NSW, Australia
Hye-Young Paik
Swinburne University of Technology, Hawthorn, VIC, Australia
Rui Zhou

About this paper

Cite this paper

Zhou, Y., Han, Y., Liu, A., Li, Z., Yin, H., Zhao, L. (2018). Extracting Representative User Subset of Social Networks Towards User Characteristics and Topological Features. In: Hacid, H., Cellary, W., Wang, H., Paik, HY., Zhou, R. (eds) Web Information Systems Engineering – WISE 2018. WISE 2018. Lecture Notes in Computer Science, vol 11233. Springer, Cham. https://doi.org/10.1007/978-3-030-02922-7_15

DOI: https://doi.org/10.1007/978-3-030-02922-7_15
Published: 20 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02921-0
Online ISBN: 978-3-030-02922-7
eBook Packages: Computer Science, Computer Science (R0)