Machine learning in an auction environment

Authors:
Patrick Hummel, Google Inc., Mountain View, CA, USA
Preston McAfee, Google Inc., Mountain View, CA, USA

WWW '14: Proceedings of the 23rd international conference on World wide web, April 2014, Pages 7–18, https://doi.org/10.1145/2566486.2567974

Published: 7 April 2014

ABSTRACT

We consider a model of repeated online auctions in which an ad with an uncertain click-through rate faces a random distribution of competing bids in each auction and there is discounting of payoffs. We formulate the optimal solution to this explore/exploit problem as a dynamic programming problem and show that efficiency is maximized by making a bid for each advertiser equal to the advertiser's expected value for the advertising opportunity plus a term proportional to the variance in this value divided by the number of impressions the advertiser has received thus far. We then use this result to illustrate that the value of incorporating active exploration into a machine learning system in an auction environment is exceedingly small.
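
The bidding rule described above can be sketched numerically. The following is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a Beta posterior over the click-through rate, a known value per click, and a hypothetical constant `exploration_weight` standing in for the proportionality factor, whose actual form the abstract does not give. The names `AdStats`, `posterior_ctr_moments`, and `exploration_bid` are likewise invented for this example.

```python
from dataclasses import dataclass


@dataclass
class AdStats:
    """Observed history for one ad (illustrative container, not from the paper)."""
    clicks: int
    impressions: int
    value_per_click: float


def posterior_ctr_moments(clicks, impressions, prior_alpha=1.0, prior_beta=1.0):
    """Mean and variance of a Beta posterior over the click-through rate.

    The Beta(1, 1) prior is an assumption made for this sketch.
    """
    alpha = prior_alpha + clicks
    beta = prior_beta + impressions - clicks
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1.0))
    return mean, variance


def exploration_bid(ad, exploration_weight=1.0):
    """Expected value plus a bonus proportional to variance / impressions.

    exploration_weight is a hypothetical constant of proportionality.
    """
    mean, variance = posterior_ctr_moments(ad.clicks, ad.impressions)
    expected_value = ad.value_per_click * mean
    value_variance = ad.value_per_click ** 2 * variance
    # Guard against division by zero for an ad with no impressions yet.
    n = max(ad.impressions, 1)
    return expected_value + exploration_weight * value_variance / n


# Two ads with similar observed click-through rates but different histories.
new_ad = AdStats(clicks=1, impressions=10, value_per_click=2.0)
old_ad = AdStats(clicks=100, impressions=1000, value_per_click=2.0)

# The exploration bonus is the gap between the bid and the plain expected value.
bonus_new = exploration_bid(new_ad) - exploration_bid(new_ad, exploration_weight=0.0)
bonus_old = exploration_bid(old_ad) - exploration_bid(old_ad, exploration_weight=0.0)
print(bonus_new > bonus_old)  # prints True
```

In this sketch the bonus for the ad with 1,000 impressions is several orders of magnitude smaller than for the ad with 10, so the bid quickly approaches the plain expected value as impressions accumulate.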

Index Terms

Applied computing → Law, social and behavioral sciences → Economics

Published in
WWW '14: Proceedings of the 23rd international conference on World wide web
April 2014, 926 pages
ISBN: 9781450327442
DOI: 10.1145/2566486
General Chair: Chin-Wan Chung (Korea Advanced Institute of Science and Technology, Korea)
Program Chairs: Andrei Broder (Google Inc., USA), Kyuseok Shim (Seoul National University, Korea), Torsten Suel (New York University, USA)
Copyright © 2014. Copyright is held by the International World Wide Web Conference Committee (IW3C2).
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
auctions
explore/exploit
machine learning
online advertising
Qualifiers
- research-article
Acceptance Rates
WWW '14 Paper Acceptance Rate: 84 of 645 submissions, 13%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%