Abstract
Online recommendation services are widely used across the applications of telecommunication companies. Such applications typically serve a very large user base with diverse characteristics and habits, which makes it challenging to achieve a high click-through rate (CTR) for online recommendations. In this paper, we propose an approach that combines ensemble trees with logistic regression (LR). The ensemble trees are effective at capturing the joint information of different features, and this information is then used by the LR scheme. In addition, to deal with scalability issues, we implemented our system with both Apache Storm (for real-time prediction and classification) and Apache Spark (for fast offline model training). A group of experiments was carried out on real-world data sets, and the results show the efficiency and effectiveness of the proposed approach.
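The combination of ensemble trees and LR described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the trees below are hand-written and hypothetical (in a real system they would be learned, for example by a gradient-boosted tree trainer), and the names `TREES`, `encode`, `train_lr` and `predict_ctr` are assumptions introduced only for this example. Each tree maps an input to a leaf, the leaf indices are one-hot encoded, and a logistic regression is trained on the encoded vector.

```python
import math

# A tree is either a leaf id (int) or a tuple (feature_index, threshold, left, right).
# ASSUMPTION: these trees are hand-written for illustration; a real system
# would learn them from data.
TREES = [
    (0, 0.5, (1, 0.5, 0, 1), (1, 0.5, 2, 3)),  # splits on feature 0, then feature 1 (4 leaves)
    (1, 0.5, 0, 1),                            # single split on feature 1 (2 leaves)
]
LEAF_COUNTS = [4, 2]

def leaf_index(tree, x):
    """Walk one tree and return the id of the leaf that x falls into."""
    while not isinstance(tree, int):
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree

def encode(x):
    """Concatenate one one-hot leaf indicator per tree."""
    vec = []
    for tree, n_leaves in zip(TREES, LEAF_COUNTS):
        one_hot = [0.0] * n_leaves
        one_hot[leaf_index(tree, x)] = 1.0
        vec.extend(one_hot)
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_lr(rows, labels, epochs=2000, step=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - step * err * xi for wi, xi in zip(w, x)]
            b -= step * err
    return w, b

def predict_ctr(w, b, x):
    """Predicted click probability for a raw feature vector x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, encode(x))) + b)

# Toy data: a click happens only when exactly one of the two features is set,
# a joint effect that LR on the raw features cannot represent.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
w, b = train_lr([encode(x) for x in X], y)
for x in X:
    print(x, round(predict_ctr(w, b, x), 3))
```

The toy labels depend on the interaction of the two features, so the leaf indicators of the first tree carry exactly the joint information that the linear model needs.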
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Fu, H., Chen, K., Ding, J. (2015). An Empirical Study of a Large Scale Online Recommendation System. In: Cai, R., Chen, K., Hong, L., Yang, X., Zhang, R., Zou, L. (eds) Web Technologies and Applications. APWeb 2015. Lecture Notes in Computer Science, vol 9461. Springer, Cham. https://doi.org/10.1007/978-3-319-28121-6_2
DOI: https://doi.org/10.1007/978-3-319-28121-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28120-9
Online ISBN: 978-3-319-28121-6
eBook Packages: Computer Science, Computer Science (R0)