DOI: 10.1145/3018661.3018693
research-article

Leveraging Behavioral Factorization and Prior Knowledge for Community Discovery and Profiling

Published: 02 February 2017

Abstract

Community detection in social media has recently attracted much interest as a way to understand the collective behaviours of users and to model individuals in the context of their group. Most existing approaches exploit either users' social links or their published content, aiming to discover groups of densely connected or highly similar users. They often fail to find effective communities because of excessive noise in content, sparsity in links, and the heterogeneous behaviours of users in social media. Further, they cannot provide insights into, or rationales for, the formation of a group and the collective behaviours of its users. To tackle these challenges, we propose to discover communities in a low-dimensional latent space in which we simultaneously learn the representations of users and communities. In particular, we integrate different social views of the network into a low-dimensional latent space in which we seek dense clusters of users as communities. By imposing a Laplacian regularizer on the affiliation matrix, we further incorporate prior knowledge into the community discovery process. Finally, community profiles are computed by a linear operator that integrates the profiles of their members. Taking the wellness domain as an example, we conduct experiments on a large-scale real-world dataset of users tweeting about diabetes and its related concepts; the results demonstrate the effectiveness of our approach in discovering and profiling user communities.
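
The pipeline outlined in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the objective, the alternating least squares solver, the k-means step, and all names (`fit_shared_embedding`, `kmeans`, `community_profiles`, `W_prior`, `alpha`, `lam`) are assumptions made for illustration, and the data are synthetic. As a simplification, the Laplacian regularizer is placed on the shared user embedding rather than on a separate affiliation matrix, and the linear profiling operator is taken to be the mean of the member profiles.

```python
import numpy as np

def fit_shared_embedding(views, W_prior, k=2, alpha=1.0, lam=0.1, iters=30, seed=0):
    """Alternating least squares (illustrative objective, not the paper's) for
    sum_v ||X_v - U B_v||_F^2 + alpha*tr(U^T L U) + lam*(||U||_F^2 + sum_v ||B_v||_F^2),
    where L is the Laplacian of the symmetric prior-knowledge graph W_prior."""
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    L = np.diag(W_prior.sum(axis=1)) - W_prior       # unnormalized graph Laplacian
    A = alpha * L + lam * np.eye(n)                  # symmetric positive definite
    a_vals, a_vecs = np.linalg.eigh(A)
    U = rng.standard_normal((n, k))
    history = []
    for _ in range(iters):
        # B_v step: ridge regression of each view onto the shared embedding.
        G = U.T @ U + lam * np.eye(k)
        bases = [np.linalg.solve(G, U.T @ X) for X in views]
        # U step: solve the Sylvester equation A U + U S = C by diagonalizing A and S.
        S = sum(B @ B.T for B in bases)
        C = sum(X @ B.T for X, B in zip(views, bases))
        s_vals, s_vecs = np.linalg.eigh(S)
        U = a_vecs @ ((a_vecs.T @ C @ s_vecs) / (a_vals[:, None] + s_vals[None, :])) @ s_vecs.T
        loss = sum(np.linalg.norm(X - U @ B) ** 2 for X, B in zip(views, bases))
        loss += alpha * np.trace(U.T @ L @ U)
        loss += lam * (np.linalg.norm(U) ** 2 + sum(np.linalg.norm(B) ** 2 for B in bases))
        history.append(loss)
    return U, history

def kmeans(Z, c, iters=20):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, c):
        d = np.min([np.sum((Z - m) ** 2, axis=1) for m in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((Z[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(c):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def community_profiles(labels, user_profiles, c):
    """Linear profiling operator (assumed): row j is the mean profile of community j."""
    H = np.eye(c)[labels]                            # n x c hard affiliation matrix
    sizes = np.maximum(H.sum(axis=0), 1)[:, None]
    return (H.T @ user_profiles) / sizes

# Synthetic example: 40 users, two planted communities, two views of each user.
rng = np.random.default_rng(42)
truth = np.repeat([0, 1], 20)
content = np.eye(2)[truth] @ rng.standard_normal((2, 8)) + 0.3 * rng.standard_normal((40, 8))
links = np.eye(2)[truth] @ rng.standard_normal((2, 6)) + 0.3 * rng.standard_normal((40, 6))

# Prior knowledge (hypothetical): a few user pairs known to share a community.
W = np.zeros((40, 40))
for i, j in [(0, 1), (2, 3), (20, 21), (22, 23)]:
    W[i, j] = W[j, i] = 1.0

U, history = fit_shared_embedding([content, links], W, k=2)
labels = kmeans(U, 2)
profiles = community_profiles(labels, content, 2)
```

Each alternating step minimizes the objective exactly over one block of variables, so the values recorded in `history` should not increase. The weight `alpha` controls how strongly the assumed prior links pull linked users together in the latent space.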

        Published In

        WSDM '17: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining
        February 2017
        868 pages
        ISBN: 9781450346757
        DOI: 10.1145/3018661

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. factorization
        2. group profiling
        3. latent space learning
        4. social networks
        5. user profiling

        Qualifiers

        • Research-article

        Funding Sources

        • NEXT++

        Conference

        WSDM 2017

        Acceptance Rates

        WSDM '17 Paper Acceptance Rate 80 of 505 submissions, 16%;
        Overall Acceptance Rate 498 of 2,863 submissions, 17%

        Article Metrics

        • Downloads (Last 12 months): 14
        • Downloads (Last 6 weeks): 0
        Reflects downloads up to 17 Jan 2025

        Cited By

        • (2023) Multi-aspect Data Learning: Overview, Challenges and Approaches. Multi-aspect Learning, pp. 1-25. DOI: 10.1007/978-3-031-33560-0_1. Online publication date: 28-Jul-2023
        • (2022) Modeling and Analysis of Group User Portrait through WeChat Mini Program. Wireless Communications & Mobile Computing, vol. 2022. DOI: 10.1155/2022/2515962. Online publication date: 1-Jan-2022
        • (2021) An Improved Community Detection Algorithm via Fusing Topology and Attribute Information. 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pp. 1069-1074. DOI: 10.1109/CSCWD49262.2021.9437681. Online publication date: 5-May-2021
        • (2020) Discovering Communities with SGNS Modelling-based Network connections and Text communications Clustering. 2020 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1770-1777. DOI: 10.1109/SSCI47803.2020.9308190. Online publication date: 1-Dec-2020
        • (2020) The role of knowledge in determining identity of long-tail entities. Journal of Web Semantics, article 100565. DOI: 10.1016/j.websem.2020.100565. Online publication date: Apr-2020
        • (2020) Big data analytics meets social media: A systematic review of techniques, open issues, and future directions. Telematics and Informatics, article 101517. DOI: 10.1016/j.tele.2020.101517. Online publication date: Oct-2020
        • (2020) Temporal Latent Space Modeling for Community Prediction. Advances in Information Retrieval, pp. 745-759. DOI: 10.1007/978-3-030-45439-5_49. Online publication date: 8-Apr-2020
        • (2019) Characterising and evaluating dynamic online communities from live microblogging user interactions. Social Network Analysis and Mining, vol. 9, no. 1. DOI: 10.1007/s13278-019-0576-8. Online publication date: 3-Jul-2019
        • (2019) Multi-type Relational Data Clustering for Community Detection by Exploiting Content and Structure Information in Social Networks. PRICAI 2019: Trends in Artificial Intelligence, pp. 541-554. DOI: 10.1007/978-3-030-29911-8_42. Online publication date: 23-Aug-2019
        • (2019) Learning Wellness Profiles of Users on Social Networks: The Case of Diabetes. Social Web and Health Research, pp. 139-169. DOI: 10.1007/978-3-030-14714-3_8. Online publication date: 29-Jun-2019