DOI: 10.1145/2648584.2648587
tutorial

Pleasing the advertising oracle: Probabilistic prediction from sampled, aggregated ground truth

Published: 24 August 2014

ABSTRACT

Most video advertising campaigns today are still evaluated based on aggregate demographic audience metrics, rather than measures of individual impact or even individual demographic reach. To fit in with advertisers' evaluations, campaigns must be optimized toward validation by third-party measurement companies, which act as "oracles" in assessing ground truth. However, information is only available from such oracles in aggregate, leading to a setting with incomplete ground truth. We explore methods for building probabilistic classification models using these aggregate data. If they perform well, such models can be used to create new "engineered" segments that perform better than existing segments, in terms of lift and/or reach. We focus on the setting where companies already have machinery in place for high-performance predictive modeling from traditional, individual-level data. We show that model building, evaluation, and selection can be reliably carried out even with access only to aggregate ground truth data. We show various concrete results, highlighting confounding aspects of the problem, such as the tendency for pre-existing "in-target" segments actually to comprise biased subpopulations, which has implications both for campaign performance and modeling performance. The paper's main results show that these methods lead to engineered segments that can substantially improve lift and/or reach, as verified by a leading third-party oracle. For example, for lifts of 2-3X, segment reach can be increased to 57 times that of comparable, pre-existing segments.
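The abstract compares segments by lift and reach but does not define either. As a rough illustration only, the sketch below assumes the common definitions: lift is a segment's reported in-target rate divided by a baseline population rate, and relative reach is a segment's audience size as a multiple of a reference segment's. The `AggregateReport` type, its fields, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class AggregateReport:
    """Hypothetical aggregate report from a measurement oracle for one segment."""
    segment: str
    audience_size: int      # number of users in the segment
    in_target_rate: float   # fraction of the sampled audience reported in-target


def lift(report: AggregateReport, baseline_rate: float) -> float:
    """In-target rate of the segment relative to a baseline population rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return report.in_target_rate / baseline_rate


def relative_reach(report: AggregateReport, reference: AggregateReport) -> float:
    """Segment audience size as a multiple of a reference segment's size."""
    if reference.audience_size <= 0:
        raise ValueError("reference audience must be non-empty")
    return report.audience_size / reference.audience_size


# Made-up numbers for illustration only; they are not results from the paper.
existing = AggregateReport("pre-existing segment", 100_000, 0.30)
engineered = AggregateReport("engineered segment", 400_000, 0.25)
baseline = 0.10  # assumed in-target rate of the general population

print("lift (existing):  ", lift(existing, baseline))
print("lift (engineered):", lift(engineered, baseline))
print("relative reach:   ", relative_reach(engineered, existing))
```

Under these assumed definitions, both quantities can be computed from aggregate reports alone, which is the kind of information the abstract says the oracle provides.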

    • Published in

      ADKDD'14: Proceedings of the Eighth International Workshop on Data Mining for Online Advertising
      August 2014
      65 pages
      ISBN: 9781450329996
      DOI: 10.1145/2648584

      Copyright © 2014 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • tutorial
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 12 of 21 submissions, 57%
