Abstract
Online media give marketers the opportunity to deliver effective brand messages to a wide range of audiences at scale. Advertising technology platforms enable advertisers to reach their target audience by delivering ad impressions to online users in real time. To identify the best marketing message for a user and to purchase impressions at the right price, we rely heavily on bid prediction and optimization models. Although bid prediction models are well studied in the literature, the equally important subject of model evaluation is usually overlooked or not discussed in detail. Effective and reliable evaluation of an online bidding model is crucial both for improving models faster and for using marketing budgets more efficiently. In this paper, we present an experimentation framework for bid prediction models, focusing on the practical aspects of model evaluation. Specifically, we outline the unique challenges we encounter in our platform, which arise from factors such as heterogeneous goal definitions, varying budget requirements across campaigns, high seasonality, and the auction-based environment for inventory purchasing. We then introduce return on investment as a unified model performance (i.e., success) metric and explain its merits over more traditional metrics such as click-through rate or conversion rate. Most importantly, we discuss commonly used evaluation and metric summarization approaches in detail and propose a more accurate method for online evaluation of new experimental models against the baseline. Our meta-analysis-based approach addresses various shortcomings of other methods and yields statistically robust results that allow us to conclude experiments more quickly and reliably. We demonstrate the effectiveness of our evaluation strategy through experiments on real campaign data.
Notes
We will use experiment or study and campaign interchangeably.
Note that at the time of writing, no public dataset was available for comprehensive experimentation. The code that implements our approach and a sample data set are available at http://github.com/turn/ModelEvaluation.
Acknowledgments
We would like to thank Sahin Geyik, Jianqiang Shen and Sina Jafarour for their helpful comments.
Cite this article
Shariat, S., Orten, B. & Dasdan, A. Online evaluation of bid prediction models in a large-scale computational advertising platform: decision making and insights. Knowl Inf Syst 51, 37–60 (2017). https://doi.org/10.1007/s10115-016-0972-6
DOI: https://doi.org/10.1007/s10115-016-0972-6