Telco User Activity Level Prediction with Massive Mobile Broadband Data

Published: 02 May 2016

Abstract

Telecommunication (telco) operators aim to provide users with optimized services and bandwidth in a timely manner, improving user experience while retaining profit. Knowing in advance how users’ behavior patterns change, as reflected in their activity levels, helps operators adjust their management strategies and reduce operational risk. To this end, operators can use knowledge discovered from their historical mobile broadband (MBB) records to predict mobile access activity levels at an early stage. In this article, we report our research in a real-world telco setting involving more than one million telco users. Our novel contribution is to represent users as documents containing a collection of changing spatiotemporal “words” that express user behavior. By extracting users’ space-time access records from MBB data, we use latent Dirichlet allocation (LDA) to learn user-specific compact topic features for user activity level prediction. We propose an online expectation-maximization (OEM) algorithm that scales LDA to massive MBB data and is significantly faster than several state-of-the-art online LDA algorithms. Using these real-world MBB data, we confirm high performance in user activity level prediction. In addition, the inferred topics indicate that future activity level anomalies correlate highly with early skewed bandwidth supply and demand relations. Thus, our prediction system can also guide telco operators in balancing the telecommunication network in terms of supply-demand relations, saving deployment costs and energy of cell towers in the future.
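The idea of treating each user as a document of spatiotemporal “words” and learning topic features online can be illustrated with a small sketch. This is not the OEM algorithm proposed in the article; it is a generic stepwise online EM for a smoothed LDA-like topic model, written under several assumptions: the word encoding (`cell_id@hour-bucket`), the function names (`to_word`, `user_documents`, `infer_theta`, `online_em`), the hyperparameter values, and the toy records are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def to_word(cell_id, hour, bucket_hours=6):
    """Encode one access record as a spatiotemporal 'word' (hypothetical scheme)."""
    start = (hour // bucket_hours) * bucket_hours
    return f"{cell_id}@h{start:02d}"


def user_documents(records):
    """Group (user, cell_id, hour) records into one bag of words per user."""
    docs = {}
    for user, cell_id, hour in records:
        docs.setdefault(user, Counter())[to_word(cell_id, hour)] += 1
    return docs


def infer_theta(doc, stat, totals, alpha, beta, inner):
    """Fit one user's topic proportions with the topic-word statistics held fixed."""
    K, W = len(stat), len(stat[0])
    theta = [1.0 / K] * K
    resp = {}
    for _ in range(inner):
        counts = [alpha] * K
        for w, c in doc:
            # E-step: responsibility of each topic for word w in this document.
            p = [theta[k] * (stat[k][w] + beta) / (totals[k] + W * beta) for k in range(K)]
            z = sum(p)
            resp[w] = [x / z for x in p]
            for k in range(K):
                counts[k] += c * resp[w][k]
        n = sum(counts)
        theta = [x / n for x in counts]  # M-step for this document only
    return theta, resp


def online_em(docs, K=2, alpha=0.1, beta=0.01, epochs=20, inner=10, kappa=0.7, seed=0):
    """Stepwise online EM: blend each user's expected counts into running statistics."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs.values() for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    encoded = {u: [(index[w], c) for w, c in sorted(d.items())] for u, d in docs.items()}
    users, D, W = sorted(encoded), len(encoded), len(vocab)
    # Running topic-word sufficient statistics, randomly initialized to break symmetry.
    stat = [[1.0 + rng.random() for _ in range(W)] for _ in range(K)]
    step = 0
    for _ in range(epochs):
        rng.shuffle(users)
        for u in users:
            step += 1
            rho = (step + 1) ** -kappa  # decaying stochastic-approximation step size
            totals = [sum(row) for row in stat]
            _, resp = infer_theta(encoded[u], stat, totals, alpha, beta, inner)
            for k in range(K):
                row = stat[k]
                for w in range(W):
                    row[w] *= 1.0 - rho
                for w, c in encoded[u]:
                    # Scale by D so one document stands in for the whole corpus.
                    row[w] += rho * D * c * resp[w][k]
    # Final pass: per-user topic features under the learned statistics.
    totals = [sum(row) for row in stat]
    theta = {u: infer_theta(encoded[u], stat, totals, alpha, beta, inner)[0] for u in users}
    return theta, vocab


# Toy records: (user, cell tower id, hour of day). Entirely made up.
records = [
    ("u1", "cellA", 8), ("u1", "cellA", 9), ("u1", "cellB", 20),
    ("u2", "cellA", 8), ("u2", "cellA", 9), ("u2", "cellB", 20),
    ("u3", "cellC", 2), ("u3", "cellC", 3), ("u3", "cellC", 14),
]
docs = user_documents(records)
theta, vocab = online_em(docs)
```

Each entry of `theta` is a length-`K` vector of topic proportions for one user. In a pipeline like the one the abstract describes, such vectors would serve as compact features for a downstream activity level classifier; that classifier is outside the scope of this sketch.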


    Published In

    ACM Transactions on Intelligent Systems and Technology  Volume 7, Issue 4
    Special Issue on Crowd in Intelligent Systems, Research Note/Short Paper and Regular Papers
    July 2016
    498 pages
    ISSN:2157-6904
    EISSN:2157-6912
    DOI:10.1145/2906145
    Editor: Yu Zheng
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 May 2016
    Accepted: 01 December 2015
    Revised: 01 November 2015
    Received: 01 February 2015
    Published in TIST Volume 7, Issue 4

    Author Tags

    1. Mobile broadband
    2. OEM algorithm
    3. activity level prediction
    4. big spatiotemporal data
    5. latent Dirichlet allocation
    6. user-specific topic features

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • NSFC
    • National Grant Fundamental Research (973 Program) of China
    • Natural Science Foundation of the Jiangsu Higher Education Institutions of China
    • Innovative Research Team in Soochow University
    • Collaborative Innovation Center of Novel Software Technology and Industrialization

    Article Metrics

    • Downloads (Last 12 months): 10
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 16 Feb 2025

    Cited By

    • (2024) Characterizing Internet Card User Portraits for Efficient Churn Prediction Model Design. IEEE Transactions on Mobile Computing 23:2 (1735-1752). DOI: 10.1109/TMC.2023.3241206. Online publication date: 1-Feb-2024
    • (2024) Ensemble prediction of RRC session duration in real-world NR/LTE networks. Machine Learning with Applications 17 (100564). DOI: 10.1016/j.mlwa.2024.100564. Online publication date: Sep-2024
    • (2023) Towards an E-commerce Personalized Recommendation System with KNN Classification Method. International Conference on Advanced Intelligent Systems for Sustainable Development (364-382). DOI: 10.1007/978-3-031-26384-2_32. Online publication date: 10-Jun-2023
    • (2019) Telco Big Data Research and Open Problems. 2019 IEEE 35th International Conference on Data Engineering (ICDE) (2056-2059). DOI: 10.1109/ICDE.2019.00238. Online publication date: Apr-2019
    • (2019) Continuous decaying of telco big data with data postdiction. GeoInformatica 23:4 (533-557). DOI: 10.1007/s10707-019-00364-z. Online publication date: 21-Jun-2019
    • (2018) An Incentive Mechanism in Mobile Crowdsourcing Based on Multi-Attribute Reverse Auctions. Sensors 18:10 (3453). DOI: 10.3390/s18103453. Online publication date: 14-Oct-2018
    • (2018) Decaying Telco Big Data with Data Postdiction. 2018 19th IEEE International Conference on Mobile Data Management (MDM) (106-115). DOI: 10.1109/MDM.2018.00027. Online publication date: Jun-2018
    • (2018) Telco Big Data: Current State & Future Directions. 2018 19th IEEE International Conference on Mobile Data Management (MDM) (11-14). DOI: 10.1109/MDM.2018.00016. Online publication date: Jun-2018
    • (2017) Towards Real-Time Road Traffic Analytics using Telco Big Data. Proceedings of the International Workshop on Real-Time Business Intelligence and Analytics (1-5). DOI: 10.1145/3129292.3129296. Online publication date: 28-Aug-2017
    • (2017) Learning k for kNN Classification. ACM Transactions on Intelligent Systems and Technology 8:3 (1-19). DOI: 10.1145/2990508. Online publication date: 12-Jan-2017
