ABSTRACT
Most video advertising campaigns today are still evaluated on aggregate demographic audience metrics rather than on measures of individual impact or even individual demographic reach. To fit advertisers' evaluations, campaigns must be optimized toward validation by third-party measurement companies, which act as "oracles" in assessing ground truth. However, such oracles release information only in aggregate, which leads to a setting with incomplete ground truth. We explore methods for building probabilistic classification models from these aggregate data. If they perform well, such models can be used to create new "engineered" segments that outperform existing segments in terms of lift, reach, or both. We focus on the setting where companies already have machinery in place for high-performance predictive modeling from traditional, individual-level data. We show that model building, evaluation, and selection can be carried out reliably even with access only to aggregate ground truth data. We present various concrete results that highlight confounding aspects of the problem, such as the tendency for pre-existing "in-target" segments to actually comprise biased subpopulations, which has implications for both campaign performance and modeling performance. The paper's main results show that these methods lead to engineered segments that can substantially improve lift and/or reach, as verified by a leading third-party oracle. For example, for lifts of 2-3X, segment reach can be increased to 57 times that of comparable, pre-existing segments.
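To make the two evaluation quantities concrete, here is a minimal Python sketch of how lift and reach might be computed for a segment when the oracle reports only aggregate counts. The definitions used (lift as the segment's in-target rate divided by a population base rate, reach as segment size times in-target rate), the `AggregateReport` type, and all numbers are illustrative assumptions, not the paper's definitions or results.

```python
from dataclasses import dataclass


@dataclass
class AggregateReport:
    """Hypothetical aggregate oracle report for one segment (assumed shape)."""
    segment_size: int  # users available in the segment
    sampled: int       # impressions the oracle measured
    in_target: int     # measured impressions that hit the target demographic


def in_target_rate(report: AggregateReport) -> float:
    # Only the aggregate fraction is observable, never per-user labels.
    return report.in_target / report.sampled


def lift(report: AggregateReport, base_rate: float) -> float:
    # Assumed definition: segment in-target rate relative to the base rate.
    return in_target_rate(report) / base_rate


def estimated_reach(report: AggregateReport) -> float:
    # Assumed definition: expected number of in-target users in the segment.
    return report.segment_size * in_target_rate(report)


# Made-up numbers, purely for illustration.
base_rate = 0.25
existing = AggregateReport(segment_size=100_000, sampled=2_000, in_target=1_200)
engineered = AggregateReport(segment_size=250_000, sampled=2_000, in_target=1_000)

for name, report in [("existing", existing), ("engineered", engineered)]:
    print(name, lift(report, base_rate), estimated_reach(report))
```

In this toy comparison the engineered segment gives up some lift in exchange for a larger estimated reach, which is the kind of trade-off the abstract refers to.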
Index Terms
- Pleasing the advertising oracle: Probabilistic prediction from sampled, aggregated ground truth