DOI: 10.1145/3583780.3615502
research-article

The Price is Right: Removing A/B Test Bias in a Marketplace of Expirable Goods

Published: 21 October 2023

Abstract

Pricing Guidance tools at Airbnb aim to help hosts maximize the earnings for each night of stay. For a given listing, the earnings-maximizing price point of a night can vary greatly with lead-day, the number of days from now until the night of stay. This introduces systematic bias when running marketplace A/B tests to compare the performance of two pricing strategies. Lead-day bias can cause the short-term experiment result to move in the opposite direction to the long-term impact, possibly leading to suboptimal business decisions and customer dissatisfaction. We propose an efficient experimentation approach that corrects for the bias, minimizes the possible negative impact of experimenting, and greatly accelerates the R&D cycle. This paper is the first of its kind to lay out the theoretical framework along with a real-world example that demonstrates the magnitude of the bias. It serves as a conversation starter for this insidious type of experimentation bias, which is likely present in other marketplaces of expirable goods such as vacation nights, car rentals, airline tickets, concert passes, and ride-hailing.
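
To make the sign reversal described in the abstract concrete, the following is a minimal toy sketch in Python. It is not the paper's framework or data: the prices, the 59-day booking horizon, the 7-day "last-minute" cutoff, the daily booking probabilities, and every function name are hypothetical assumptions chosen only for illustration. The sketch assumes a cheaper treatment price attracts more far-out bookings while last-minute demand is price-insensitive, then compares a 7-day test over nights at every lead-day with the outcome when each night is followed all the way to the night of stay.

```python
# Toy illustration of lead-day bias. Every number below is a hypothetical assumption.
MAX_LEAD = 59      # assumed: a night becomes bookable 59 days ahead
LAST_MINUTE = 7    # assumed: lead-days <= 7 count as last-minute

PRICE = {"control": 100.0, "treatment": 90.0}  # assumed nightly prices per arm


def daily_booking_prob(arm: str, lead: int) -> float:
    """Assumed chance that an unbooked night is booked on a day with this lead-day."""
    if lead <= LAST_MINUTE:
        return 0.15  # last-minute guests are price-insensitive in this toy
    return 0.02 if arm == "control" else 0.03  # lower price attracts early bookers


def expected_revenue(arm: str, start_lead: int, days_observed: int) -> float:
    """Expected revenue of one unbooked night observed for `days_observed` days."""
    still_unbooked = 1.0
    end_lead = max(start_lead - days_observed + 1, 0)
    for lead in range(start_lead, end_lead - 1, -1):
        still_unbooked *= 1.0 - daily_booking_prob(arm, lead)
    return PRICE[arm] * (1.0 - still_unbooked)


def short_term_effect(window_days: int = 7) -> float:
    """Treatment minus control in a short test, averaged over nights at every lead-day."""
    diffs = [
        expected_revenue("treatment", lead, window_days)
        - expected_revenue("control", lead, window_days)
        for lead in range(MAX_LEAD + 1)
    ]
    return sum(diffs) / len(diffs)


def long_term_effect() -> float:
    """Treatment minus control when a night is followed from MAX_LEAD down to lead-day 0."""
    full_horizon = MAX_LEAD + 1
    return (expected_revenue("treatment", MAX_LEAD, full_horizon)
            - expected_revenue("control", MAX_LEAD, full_horizon))


if __name__ == "__main__":
    print(f"short-term effect per night: {short_term_effect():+.2f}")
    print(f"long-term effect per night:  {long_term_effect():+.2f}")
```

Under these assumed numbers the short-term effect is positive while the long-term effect is negative: the short test mostly observes far-out nights, where the cheaper price pulls bookings forward, and misses the last-minute demand that the higher price would have captured.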

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
October 2023
5508 pages
ISBN:9798400701245
DOI:10.1145/3583780
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. A/B testing
  2. dynamic pricing
  3. heterogeneous treatment effect
  4. marketplace experimentation
  5. online controlled experiment

Qualifiers

  • Research-article

Conference

CIKM '23

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
