Abstract
The telecommunications industry is particularly rich in customer data, and telecom companies want to use this data to prevent customer churn and to increase revenue per user through personalization and customer acquisition. Massive-scale analytics tools provide an opportunity to achieve this in a flexible and scalable way. In this context, we have developed IBM Customer Analyst, a component library that analyzes customer behavioral data and enables new insights and business scenarios based on the relationship between users and the content they create and consume. Because of the massive amount of data and the large number of users, this technology is built on IBM InfoSphere BigInsights and Apache Hadoop. In this work, we first describe an efficient user profiling framework, with guarantees of high profiling quality, based on mobile web browsing log analysis. We describe the use of the Open Directory Project categories to generate user profiles. We then describe an end-to-end analysis flow and discuss its challenges. Last, we validate our methods through extensive experiments based on real data sets.
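As a rough illustration of category-based profiling from browsing logs, the sketch below maps each visited URL to a directory category and normalizes the per-user counts into a profile. It is a minimal in-memory sketch, not the chapter's framework: the `HOST_TO_CATEGORY` table, the `(user_id, url)` record layout, and the function names are assumptions made for this example, and a deployment at the scale described above would distribute the same counting step over Hadoop.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse

# Hypothetical host-to-category lookup; a real system would derive this
# from the Open Directory Project data rather than hard-code it.
HOST_TO_CATEGORY = {
    "espn.com": "Top/Sports",
    "bbc.co.uk": "Top/News",
    "imdb.com": "Top/Arts/Movies",
}


def categorize(url):
    """Return the category for a URL's host, or None if the host is unknown."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return HOST_TO_CATEGORY.get(host)


def build_profiles(log_records):
    """Turn (user_id, url) records into per-user category distributions."""
    counts = defaultdict(Counter)
    for user_id, url in log_records:
        category = categorize(url)
        if category is not None:  # skip URLs with no known category
            counts[user_id][category] += 1

    profiles = {}
    for user_id, counter in counts.items():
        total = sum(counter.values())
        # Normalize counts so each user's weights sum to 1.
        profiles[user_id] = {c: n / total for c, n in counter.items()}
    return profiles
```

Users whose visits never match a known category get no profile in this sketch; whether to keep them with an empty profile is a design choice left open here.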
Notes
- 1. Some countries do in fact require some kind of registration, but this is not a technological or business requirement.
Acknowledgements
The authors would like to thank Shai Erera and Gilad Barkai for useful discussions about implementation issues. We also thank Haggai Roitman for sharing thoughts and ideas. Finally, we thank Matin Jouzdani for his support in making this project a success.
Copyright information
© 2014 Springer Science+Business Media New York
About this chapter
Cite this chapter
Konopnicki, D., Shmueli-Scheuer, M. (2014). Customer Analyst for the Telecom Industry. In: Gkoulalas-Divanis, A., Labbi, A. (eds) Large-Scale Data Analytics. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9242-9_4
DOI: https://doi.org/10.1007/978-1-4614-9242-9_4
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-9241-2
Online ISBN: 978-1-4614-9242-9
eBook Packages: Computer Science, Computer Science (R0)