
Online evaluation of bid prediction models in a large-scale computational advertising platform: decision making and insights

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

Online media provides opportunities for marketers to deliver effective brand messages to a wide range of audiences at scale. Advertising technology platforms enable advertisers to reach their target audience by delivering ad impressions to online users in real time. To identify the best marketing message for a user and to purchase impressions at the right price, we rely heavily on bid prediction and optimization models. Although bid prediction models are well studied in the literature, the equally important subject of model evaluation is usually overlooked or not discussed in detail. Effective and reliable evaluation of an online bidding model is crucial for making faster model improvements as well as for utilizing marketing budgets more efficiently. In this paper, we present an experimentation framework for bid prediction models, focusing on the practical aspects of model evaluation. Specifically, we outline the unique challenges we encounter in our platform due to factors such as heterogeneous goal definitions, varying budget requirements across campaigns, high seasonality, and the auction-based environment for inventory purchasing. We then introduce return on investment (ROI) as a unified model performance (i.e., success) metric and explain its merits over more traditional metrics such as click-through rate or conversion rate. Most importantly, we discuss commonly used evaluation and metric summarization approaches in detail and propose a more accurate method for the online evaluation of new experimental models against the baseline. Our meta-analysis-based approach addresses various shortcomings of other methods and yields statistically robust conclusions, allowing us to conclude experiments more quickly and reliably. We demonstrate the effectiveness of our evaluation strategy through experiments on real campaign data.
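
To make the summarization step concrete, the following is a minimal sketch of one standard meta-analytic estimator, the DerSimonian-Laird random-effects model, applied to per-campaign effect sizes. It is illustrative only: the effect-size definition (a log ROI lift of the experimental model over the baseline), the sample numbers, and the helper name dersimonian_laird are assumptions made for this sketch, not the authors' implementation (their code is linked in the Notes below).

    # Illustrative sketch: DerSimonian-Laird random-effects meta-analysis.
    # ASSUMPTIONS: effect sizes are hypothetical log ROI lifts per campaign;
    # this is a sketch, not the authors' published implementation.
    import numpy as np
    from scipy.stats import norm

    def dersimonian_laird(effects, variances):
        """Pool per-campaign effect estimates into one overall effect.

        effects   : per-campaign effect sizes (e.g., log ROI lifts)
        variances : within-campaign sampling variances
        Returns the pooled effect, its standard error, and a two-sided p-value.
        """
        y = np.asarray(effects, dtype=float)
        v = np.asarray(variances, dtype=float)
        w = 1.0 / v                               # fixed-effect weights
        y_fe = np.sum(w * y) / np.sum(w)          # fixed-effect pooled mean
        q = np.sum(w * (y - y_fe) ** 2)           # Cochran's Q (heterogeneity)
        c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
        tau2 = max(0.0, (q - (len(y) - 1)) / c)   # between-campaign variance
        w_re = 1.0 / (v + tau2)                   # random-effects weights
        pooled = np.sum(w_re * y) / np.sum(w_re)
        se = np.sqrt(1.0 / np.sum(w_re))
        p_value = 2.0 * norm.sf(abs(pooled / se))  # two-sided z-test
        return pooled, se, p_value

    # Hypothetical per-campaign lifts and variances (made-up numbers).
    lifts = [0.08, 0.12, -0.02, 0.05, 0.10]
    variances = [0.010, 0.020, 0.015, 0.008, 0.012]
    effect, se, p = dersimonian_laird(lifts, variances)
    print(f"pooled lift = {effect:.3f}, se = {se:.3f}, p = {p:.3f}")

A random-effects model is a natural fit for this setting because campaigns differ in goals, budgets, and seasonality, so their true lifts need not be identical; the between-campaign variance term widens the pooled confidence interval accordingly.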

Notes

  1. We use the terms experiment (or study) and campaign interchangeably.

  2. Note that at the time of writing this paper, no public dataset was available for comprehensive experimentation. The code implementing our approach and a sample dataset are available at http://github.com/turn/ModelEvaluation.

Acknowledgments

We would like to thank Sahin Geyik, Jianqiang Shen and Sina Jafarour for their helpful comments.

Author information

Corresponding author

Correspondence to Shahriar Shariat.

About this article

Cite this article

Shariat, S., Orten, B. & Dasdan, A. Online evaluation of bid prediction models in a large-scale computational advertising platform: decision making and insights. Knowl Inf Syst 51, 37–60 (2017). https://doi.org/10.1007/s10115-016-0972-6
