tutorial

Pleasing the advertising oracle: Probabilistic prediction from sampled, aggregated ground truth

Authors:
Melinda Han Williams, Dstillery, 470 Park Ave South, New York, NY, 10016
Claudia Perlich, Dstillery, 470 Park Ave South, New York, NY, 10016
Brian Dalessandro, Dstillery, 470 Park Ave South, New York, NY, 10016
Foster Provost, NYU/Stern School & Dstillery Research, 44 W. 4th Street, New York, NY, 10012

ADKDD'14: Proceedings of the Eighth International Workshop on Data Mining for Online Advertising, August 2014, Pages 1–9, https://doi.org/10.1145/2648584.2648587

Published: 24 August 2014

ABSTRACT

Most video advertising campaigns today are still evaluated based on aggregate demographic audience metrics, rather than measures of individual impact or even individual demographic reach. To fit in with advertisers' evaluations, campaigns must be optimized toward validation by third-party measurement companies, which act as "oracles" in assessing ground truth. However, information is only available from such oracles in aggregate, leading to a setting with incomplete ground truth. We explore methods for building probabilistic classification models using these aggregate data. If they perform well, such models can be used to create new "engineered" segments that perform better than existing segments, in terms of lift and/or reach. We focus on the setting where companies already have machinery in place for high-performance predictive modeling from traditional, individual-level data. We show that model building, evaluation, and selection can be reliably carried out even with access only to aggregate ground truth data. We show various concrete results, highlighting confounding aspects of the problem, such as the tendency for pre-existing "in-target" segments actually to comprise biased subpopulations, which has implications both for campaign performance and modeling performance. The paper's main results show that these methods lead to engineered segments that can substantially improve lift and/or reach, as verified by a leading third-party oracle. For example, for lifts of 2-3X, segment reach can be increased to 57 times that of comparable, pre-existing segments.
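
The abstract compares segments by lift and reach computed from aggregate oracle feedback. A minimal sketch of how those two quantities might be computed from aggregate reports follows. It assumes the common definitions (lift as a segment's in-target rate divided by the population base rate, reach as the segment's size); the paper's exact definitions may differ. The `AggregateReport` structure, the function names, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class AggregateReport:
    """Aggregate oracle feedback for one segment (hypothetical structure)."""
    segment: str
    reach: int             # number of users in the segment
    in_target_rate: float  # fraction of the segment the oracle reports as in-target


def lift(report: AggregateReport, base_rate: float) -> float:
    """Assumed definition: segment in-target rate over the population base rate."""
    if base_rate <= 0:
        raise ValueError("base_rate must be positive")
    return report.in_target_rate / base_rate


def relative_reach(candidate: AggregateReport, reference: AggregateReport) -> float:
    """How many times larger the candidate segment is than the reference segment."""
    if reference.reach <= 0:
        raise ValueError("reference reach must be positive")
    return candidate.reach / reference.reach


# Illustrative numbers only; they are not results from the paper.
base_rate = 0.10
existing = AggregateReport("pre-existing", reach=1_000_000, in_target_rate=0.25)
engineered = AggregateReport("engineered", reach=4_000_000, in_target_rate=0.25)

print("lift (pre-existing):", lift(existing, base_rate))
print("lift (engineered):", lift(engineered, base_rate))
print("relative reach:", relative_reach(engineered, existing))
```

Only segment-level aggregates enter the calculation, which mirrors the setting described above: no individual-level labels are needed to compare an engineered segment against a pre-existing one at equal lift.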

Index Terms

1. Computing methodologies
  1. Machine learning

Published in

ADKDD'14: Proceedings of the Eighth International Workshop on Data Mining for Online Advertising
August 2014
65 pages
ISBN: 9781450329996
DOI: 10.1145/2648584

Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Audience Targeting
Logistic Regression
Online Advertising
Probability Estimation
Qualifiers
- tutorial
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 12 of 21 submissions, 57%