Insights from the Long-Tail: Learning Latent Representations of Online User Behavior in the Presence of Skew and Sparsity

Published: 17 October 2018

Abstract

This paper proposes an approach to learning robust behavior representations on online platforms by addressing two challenges: skew in user behavior and sparse participation. Latent behavior models are important in a wide variety of applications, including recommender systems, prediction, user profiling, and community characterization. Our framework is the first to jointly address skew and sparsity across graphical behavior models. We propose a generalizable Bayesian approach that partitions users in the presence of skew while simultaneously learning latent behavior profiles over these partitions to address user-level sparsity. Our behavior profiles incorporate temporal activity and links between participants, although the framework is flexible enough to accommodate other definitions of participant behavior. Our approach explicitly discounts frequent behaviors and learns variable-size partitions that capture diverse behavior trends. The partitioning approach is data-driven with no rigid assumptions, adapting to varying degrees of skew and sparsity.
A qualitative analysis indicates that the approach can discover niche and informative user groups on large online platforms. Results on user characterization (+6-22% AUC), content recommendation (+6-43% AUC), and future activity prediction (+12-25% RMSE) indicate significant gains over state-of-the-art baselines. Furthermore, user cluster quality is validated by magnified gains in the characterization of users with sparse activity.
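
The abstract does not spell out the prior used to partition users. As an illustration only, the sketch below assumes a two-parameter (Pitman-Yor) Chinese restaurant process, a standard Bayesian construction that subtracts a discount from the weight of every existing partition and therefore yields variable-size, heavy-tailed partitions. The function name `pitman_yor_partition`, its parameters, and the default values are hypothetical and are not taken from the paper; this is not the authors' implementation, and it leaves out the behavior profiles learned over the partitions.

```python
import random


def pitman_yor_partition(n_users, discount=0.5, concentration=1.0, seed=0):
    """Assign users to partitions with a two-parameter Chinese restaurant process.

    Assumption: this prior stands in for the skew-aware partitioning
    described in the abstract; the paper's actual model may differ.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")

    rng = random.Random(seed)
    sizes = []        # sizes[k] = number of users currently in partition k
    assignment = []   # assignment[i] = partition index of user i

    for _ in range(n_users):
        if not sizes:
            # The first user always opens the first partition.
            k = 0
        else:
            # Existing partition k is chosen with weight (size_k - discount),
            # so large partitions are favored, but less than proportionally.
            weights = [s - discount for s in sizes]
            # A new partition is opened with weight
            # (concentration + discount * number_of_partitions).
            weights.append(concentration + discount * len(sizes))
            k = rng.choices(range(len(weights)), weights=weights)[0]

        if k == len(sizes):
            sizes.append(1)
        else:
            sizes[k] += 1
        assignment.append(k)

    return assignment, sizes


if __name__ == "__main__":
    _, sizes = pitman_yor_partition(1000, discount=0.5, concentration=1.0)
    print("number of partitions:", len(sizes))
    print("largest partitions:", sorted(sizes, reverse=True)[:5])
```

Under this assumed prior, a larger `discount` tends to produce more small partitions, which is one way a model can keep niche user groups from being absorbed into a few dominant ones.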

Published In

CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
October 2018
2362 pages
ISBN: 9781450360142
DOI: 10.1145/3269206
Publisher

Association for Computing Machinery

New York, NY, United States
Author Tags

  1. behavior analysis
  2. behavior skew
  3. data sparsity
  4. interactive media platforms
  5. probabilistic graphical models

Qualifiers

  • Research-article

Conference

CIKM '18
Acceptance Rates

CIKM '18 Paper Acceptance Rate 147 of 826 submissions, 18%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

  • (2024) Accurate and Scalable Graph Convolutional Networks for Recommendation Based on Subgraph Propagation. IEEE Transactions on Knowledge and Data Engineering 36:12 (7556-7568). DOI: 10.1109/TKDE.2024.3467333. Online publication date: Dec-2024.
  • (2023) DIRS-KG: a KG-enhanced interactive recommender system based on deep reinforcement learning. World Wide Web 26:5 (2471-2493). DOI: 10.1007/s11280-022-01135-x. Online publication date: 1-Apr-2023.
  • (2022) GCN-Denoiser: Mesh Denoising with Graph Convolutional Networks. ACM Transactions on Graphics 41:1 (1-14). DOI: 10.1145/3480168. Online publication date: 9-Feb-2022.
  • (2022) Multi-task Knowledge Graph Representations via Residual Functions. Advances in Knowledge Discovery and Data Mining (262-275). DOI: 10.1007/978-3-031-05933-9_21. Online publication date: 16-May-2022.
  • (2021) PCEDNet: A Lightweight Neural Network for Fast and Interactive Edge Detection in 3D Point Clouds. ACM Transactions on Graphics 41:1 (1-21). DOI: 10.1145/3481804. Online publication date: 10-Nov-2021.
  • (2021) SPEX: A Generic Framework for Enhancing Neural Social Recommendation. ACM Transactions on Information Systems 40:2 (1-33). DOI: 10.1145/3473338. Online publication date: 27-Sep-2021.
  • (2021) Bilateral Filtering Graph Convolutional Network for Multi-relational Social Recommendation in the Power-law Networks. ACM Transactions on Information Systems 40:2 (1-24). DOI: 10.1145/3469799. Online publication date: 27-Sep-2021.
  • (2021) Self-supervised transfer learning of physiological representations from free-living wearable data. Proceedings of the Conference on Health, Inference, and Learning (69-78). DOI: 10.1145/3450439.3451863. Online publication date: 8-Apr-2021.
  • (2021) Stacked Convolutional Sparse Auto-Encoders for Representation Learning. ACM Transactions on Knowledge Discovery from Data 15:2 (1-21). DOI: 10.1145/3434767. Online publication date: 5-Mar-2021.
  • (2021) A Polishing Robot Force Control System Based on Time Series Data in Industrial Internet of Things. ACM Transactions on Internet Technology 21:2 (1-22). DOI: 10.1145/3419469. Online publication date: 8-Mar-2021.
