Extracting Representative User Subset of Social Networks Towards User Characteristics and Topological Features

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11233)

Abstract

Extracting a representative subset of users from a social network plays a critical role in social network analysis. Some existing studies focus on preserving users’ characteristics when sampling representative users, while others focus on preserving the network topology. However, both the users’ characteristics and the network topology carry abundant information about users, so it is critical to preserve both when extracting the representative user subset. To this end, we propose a novel approach and formulate the task as the RUS (Representative User Subset) problem, which is proved to be NP-hard. To solve the RUS problem, we propose a method, KS (K-Selected), that consists of a clustering algorithm and a sampling model, with a greedy heuristic algorithm for solving the sampling model. To validate the proposed approach, extensive experiments are conducted on two real-world datasets. The results demonstrate that our method outperforms state-of-the-art approaches.
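The abstract does not give the details of KS, so the following is only a rough sketch of the general "cluster first, then sample greedily" idea, not the authors' algorithm. It assumes that cluster labels are already available (for example from a community-detection step on the topology), that each user is a numeric feature vector, that the budget is split across clusters in proportion to their size, and that the greedy step minimises the total Euclidean distance from each user to the nearest selected representative. The function names `greedy_representatives` and `sample_by_cluster` are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def greedy_representatives(points, k):
    """Greedily pick up to k indices; each step adds the point that most
    reduces the total distance from every point to its nearest pick.
    (Assumed objective, not taken from the paper.)"""
    chosen = []
    # nearest[i] = distance from point i to its closest chosen point so far
    nearest = [float("inf")] * len(points)
    for _ in range(min(k, len(points))):
        best_idx, best_cost = None, float("inf")
        for c in range(len(points)):
            if c in chosen:
                continue
            # Total cost if candidate c were added to the chosen set
            cost = sum(min(nearest[i], dist(points[i], points[c]))
                       for i in range(len(points)))
            if cost < best_cost:
                best_idx, best_cost = c, cost
        chosen.append(best_idx)
        nearest = [min(nearest[i], dist(points[i], points[best_idx]))
                   for i in range(len(points))]
    return chosen


def sample_by_cluster(features, labels, budget):
    """Split the budget across clusters by size (at least one user per
    cluster), then select greedily inside each cluster. Because of
    rounding, the total can differ slightly from the budget."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    selected = []
    for _, members in sorted(clusters.items()):
        quota = max(1, round(budget * len(members) / len(features)))
        local = greedy_representatives([features[i] for i in members], quota)
        selected.extend(members[j] for j in local)
    return sorted(selected)


# Toy data: two well-separated groups of three users each
features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = [0, 0, 0, 1, 1, 1]
print(sample_by_cluster(features, labels, budget=2))
```

On this toy input each cluster receives a quota of one, and the greedy step picks the user closest to the rest of its cluster. Taking the clusters from the topology and the distances from the user features is one simple way to let both sources of information influence the sample.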

Notes

  1. \(d_1\): the number of the user’s followers; \(d_2\): the number of the user’s friends; \(d_3\): the user’s influence score; \(d_4\): the user’s activity score; \(d_5\): the number of tweets; \(d_6\): the number of times the user’s tweets were “liked”; \(d_7\): the number of times the user’s tweets were “retweeted”; \(d_8\): address; \(d_9\): words and phrases.

  2. http://scikit-learn.org/stable/modules/naive_bayes.html#multinomial-naive-bayes

  3. http://scikit-learn.org/stable/modules/ensemble.html#random-forests

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61572335, 61572336, 61472263, 61402312 and 61402313, the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20151223, and the Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu, China.

Author information

Corresponding author

Correspondence to Lei Zhao.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhou, Y., Han, Y., Liu, A., Li, Z., Yin, H., Zhao, L. (2018). Extracting Representative User Subset of Social Networks Towards User Characteristics and Topological Features. In: Hacid, H., Cellary, W., Wang, H., Paik, HY., Zhou, R. (eds) Web Information Systems Engineering – WISE 2018. WISE 2018. Lecture Notes in Computer Science, vol 11233. Springer, Cham. https://doi.org/10.1007/978-3-030-02922-7_15

  • DOI: https://doi.org/10.1007/978-3-030-02922-7_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02921-0

  • Online ISBN: 978-3-030-02922-7

  • eBook Packages: Computer Science, Computer Science (R0)
