Abstract
Applying traditional machine learning algorithms to click data is challenging because of severely sparse and transient ID features, which tend to bloat the data size and prolong training time considerably. Moreover, the data size requirement, together with training and publishing overhead, makes minute-level incremental model updates a bottleneck. We propose a novel real-time click-through rate (CTR) prediction model based on empirical CTRs with a set of pre-learned priors, upon which a Minimum Variance Unbiased Estimator is constructed as the CTR prediction. The empirical CTRs are computed at the sparsest and finest ID levels, which can be strong indicators but are generally unsuited as machine learning features. Experiments on real-life click data show that our prior-based real-time estimator, combined with a traditional machine learning model, significantly improves both prediction accuracy and ranking capability, especially on the latest data that falls beyond the time-effectiveness of the machine learning model.
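The abstract does not spell out how the estimator is built, so the following is only a rough sketch of one standard way to combine several empirical CTRs: inverse-variance weighting, which gives the minimum-variance unbiased linear combination of independent unbiased estimates when their variances are known. It is not the paper's construction. The names `DimStats` and `combine_ctrs`, the use of a prior CTR to approximate each variance as p(1 - p)/n, and the sample counts are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class DimStats:
    """Click statistics for one ID dimension (e.g. cookie, URL, ad-id)."""
    clicks: int
    impressions: int
    prior_ctr: float  # hypothetical pre-learned prior CTR, must lie in (0, 1)


def combine_ctrs(stats: Iterable[DimStats]) -> Optional[float]:
    """Inverse-variance weighted average of per-dimension empirical CTRs.

    Each empirical CTR is treated as an independent unbiased estimate whose
    variance is approximated by prior_ctr * (1 - prior_ctr) / impressions.
    This approximation is an assumption of the sketch, not the paper's method.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for s in stats:
        if s.impressions == 0:
            continue  # no observations, so this dimension carries no weight
        if not 0.0 < s.prior_ctr < 1.0:
            raise ValueError("prior_ctr must lie strictly between 0 and 1")
        empirical_ctr = s.clicks / s.impressions
        variance = s.prior_ctr * (1.0 - s.prior_ctr) / s.impressions
        weight = 1.0 / variance
        weighted_sum += weight * empirical_ctr
        weight_total += weight
    if weight_total == 0.0:
        return None  # nothing observed in any dimension
    return weighted_sum / weight_total


# Made-up counts for three dimensions sharing the same prior CTR.
example = [
    DimStats(clicks=1, impressions=10, prior_ctr=0.02),       # cookie
    DimStats(clicks=30, impressions=1000, prior_ctr=0.02),    # URL
    DimStats(clicks=200, impressions=10000, prior_ctr=0.02),  # ad-id
]
combined = combine_ctrs(example)
print(combined)
```

When every dimension shares the same prior, the weights reduce to the impression counts, so sparsely observed dimensions contribute little; differing priors shift weight toward dimensions with lower assumed variance.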


Notes
Depending on the application, ad sponsoring engines usually wait from several minutes to an hour for a user's click.
In sponsored search, search session IDs and queries can be considered counterparts of the former two.
A category of features; most of the paper is focused on three dimensions: cookie, URL, and ad-id.
We limit our attention to Google-like rank-by-revenue auctions, i.e., the ads are displayed in the decreasing order of CTR * bid.
Most commercial implementations of Machine Learning models (e.g. Google and Baidu’s Logistic Regression, Microsoft’s Bayesian Probit Regression) binarize all features. For example, PLSA score between query and ad is a common feature (class). After binarization, initial PLSA values of 0.1 and 0.099 might become two totally unrelated features.
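The binarization effect described in the last note can be reproduced with a small sketch. The function `binarize`, the equal-width bucketing with width 0.1, and the feature naming scheme are assumptions chosen for illustration; production systems may bucket features differently.

```python
import math


def binarize(feature_class: str, value: float, bin_width: float = 0.1) -> str:
    """Map a real-valued feature to the name of a binary (one-hot) feature.

    Equal-width bucketing is an assumption of this sketch. The small epsilon
    guards against floating-point error for values on a bucket boundary.
    """
    bucket = math.floor(value / bin_width + 1e-9)
    return f"{feature_class}_bin{bucket}"


# Two nearly identical PLSA scores land in different buckets, so the model
# sees two unrelated binary features and learns a separate weight for each.
feature_a = binarize("plsa", 0.1)
feature_b = binarize("plsa", 0.099)
print(feature_a)  # plsa_bin1
print(feature_b)  # plsa_bin0
```

After binarization the model has no notion that the two features are numerically adjacent, which is the loss of information the note points out.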
Acknowledgements
This work is sponsored by Shanghai University of International Business and Economics (SUIBE) under both Cooperation Projects funded by the Central Finance and Project Z085YYJJ13063. We are especially grateful to collaborators at mediav.com and anonymous reviewers.
Cite this article
Fang, Y., Liu, J. A novel prior-based real-time click through rate prediction model. Int. J. Mach. Learn. & Cyber. 5, 887–895 (2014). https://doi.org/10.1007/s13042-014-0231-7
DOI: https://doi.org/10.1007/s13042-014-0231-7