DOI: 10.1145/3219819.3219821
research-article

I Know You'll Be Back: Interpretable New User Clustering and Churn Prediction on a Mobile Social Application

Published: 19 July 2018

Abstract

As online platforms strive to attract more users, user churn becomes a critical challenge, and it is especially concerning for new users. In this paper, taking anonymous, large-scale, real-world data from Snapchat as an example, we develop ClusChurn, a systematic two-step framework for interpretable new user clustering and churn prediction, based on the intuition that proper user clustering can help understand and predict user churn. ClusChurn first groups new users into interpretable typical clusters based on their activities on the platform and their ego-network structures. We then design a novel deep learning pipeline based on LSTM and attention to accurately predict user churn from very limited initial behavior data, by leveraging the correlations among users' multidimensional activities and the underlying user types. ClusChurn can also predict user types, which enables rapid reactions to different types of user churn. Extensive data analysis and experiments show that ClusChurn provides valuable insight into user behaviors and achieves state-of-the-art churn prediction performance. The whole framework is deployed as a data analysis pipeline, delivering real-time data analysis and prediction results to multiple relevant teams for business intelligence uses. It is also general enough to be readily adopted by any online system with user behavior data.
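
The abstract does not specify the architecture, so the following is only a minimal sketch of the general attention idea it mentions, not the ClusChurn model itself. It assumes each activity channel of a new user has already been summarized as a fixed-length vector (in an LSTM-based pipeline these might be final hidden states) and combines the channels with softmax attention weights. The names `attention_pool`, `channels`, and `query`, and all numeric values, are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    channel_vectors: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Combine per-channel vectors using dot-product attention.

    Each channel is scored by its dot product with `query`; the scores
    are normalized with softmax and used as weights in a weighted sum.
    Returns (weights, pooled_vector).
    """
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in channel_vectors]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, channel_vectors))
        for d in range(dim)
    ]
    return weights, pooled


# Hypothetical summaries of three activity channels for one new user
# (assumed inputs; in practice these would come from a sequence encoder).
channels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical attention query vector (a learned parameter in a real model).
query = [1.0, 0.0]

weights, pooled = attention_pool(channels, query)
```

In this toy setup the weights sum to one and the channels that align with the query receive more weight, so inspecting `weights` indicates which activity type contributed most to the pooled representation.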

Supplementary Material

MP4 File (yang_interpretable_clustering.mp4)

Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN: 9781450355520
DOI: 10.1145/3219819
                        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

                        Author Tags

                        1. churn prediction
                        2. interpretable model
                        3. user clustering

                        Conference

                        KDD '18
                        Acceptance Rates

                        KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;
                        Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

• Downloads (Last 12 months): 119
• Downloads (Last 6 weeks): 14
Reflects downloads up to 02 Mar 2025

Cited By

• (2024) Early Attrition Prediction for Web-Based Interpretation Bias Modification to Reduce Anxious Thinking: A Machine Learning Study. JMIR Mental Health 11 (e51567). DOI: 10.2196/51567. Online publication date: 20-Dec-2024
• (2024) TIM: Temporal Interaction Model in Notification System. Proceedings of the 2024 International Conference on Multimedia Retrieval (1120-1124). DOI: 10.1145/3652583.3657614. Online publication date: 30-May-2024
• (2024) General-Purpose User Modeling with Behavioral Logs: A Snapchat Case Study. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (2431-2436). DOI: 10.1145/3626772.3657908. Online publication date: 10-Jul-2024
• (2024) No Two Users Are Alike: Generating Audiences with Neural Clustering for Temporal Point Processes. Doklady Mathematics 108:S2 (S511-S528). DOI: 10.1134/S1064562423701661. Online publication date: 25-Mar-2024
• (2024) Characterizing Internet Card User Portraits for Efficient Churn Prediction Model Design. IEEE Transactions on Mobile Computing 23:2 (1735-1752). DOI: 10.1109/TMC.2023.3241206. Online publication date: Feb-2024
• (2024) Learning from Uncertainty: Improving Churn Prediction using Conformal Confidence Intervals. 2024 International Conference on Machine Learning and Applications (ICMLA) (9-16). DOI: 10.1109/ICMLA61862.2024.00008. Online publication date: 18-Dec-2024
• (2024) Customer Churn Prediction in Telecommunication and Banking using Machine Learning: A Systematic Literature Review. 2024 ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS) (483-490). DOI: 10.1109/ICETSIS61505.2024.10459439. Online publication date: 28-Jan-2024
• (2024) Explaining customer churn prediction in telecom industry using tabular machine learning models. Machine Learning with Applications 17 (100567). DOI: 10.1016/j.mlwa.2024.100567. Online publication date: Sep-2024
• (2023) Revealing the Roles of Part-of-Speech Taggers in Alzheimer Disease Detection: Scientific Discovery Using One-Intervention Causal Explanation. JMIR Formative Research 7 (e36590). DOI: 10.2196/36590. Online publication date: 2-May-2023
• (2023) Latent Aspect Detection via Backtranslation Augmentation. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (3943-3947). DOI: 10.1145/3583780.3615205. Online publication date: 21-Oct-2023
