Parameter-Free Bayesian Decision Trees for Uplift Modeling

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13936)

Abstract

Uplift modeling aims to estimate the incremental impact of a treatment, such as a marketing campaign or a drug, on an individual’s behavior. Such approaches are useful in applications such as personalized medicine and advertising, because they allow targeting the specific portion of a population on which the treatment will have the greatest impact. Uplift modeling is a challenging task because the data are only partially observed: for a given individual, responses to alternative treatments cannot be observed. In this paper, we present a new tree algorithm named UB-DT designed for uplift modeling. We propose a Bayesian evaluation criterion for uplift decision trees T by defining the posterior probability of T given uplift data. We transform the learning problem into an optimization problem: searching for the uplift tree model that yields the best value of the criterion. A search algorithm is then presented, as well as an extension to random forests. Large-scale experiments on real and synthetic datasets show the efficiency of our methods compared with other state-of-the-art uplift modeling approaches.
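To make the quantity being estimated concrete, the following minimal Python sketch computes the uplift inside each leaf of an already-built tree as the difference between the treated and control response rates, then ranks the leaves by that difference. It is an illustration only and does not implement the UB-DT criterion or its search algorithm; the names `LeafCounts`, `leaf_uplift` and `rank_leaves`, as well as the counts used, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class LeafCounts:
    """Outcome counts observed in one leaf (hypothetical container)."""
    treated_total: int
    treated_positive: int
    control_total: int
    control_positive: int


def leaf_uplift(counts: LeafCounts) -> float:
    """Estimated uplift in a leaf: P(y=1 | treated) - P(y=1 | control)."""
    if counts.treated_total == 0 or counts.control_total == 0:
        # Both groups are needed, since no individual is observed under both treatments.
        raise ValueError("leaf must contain both treated and control individuals")
    treated_rate = counts.treated_positive / counts.treated_total
    control_rate = counts.control_positive / counts.control_total
    return treated_rate - control_rate


def rank_leaves(leaves: Dict[str, LeafCounts]) -> List[Tuple[str, float]]:
    """Sort leaves from highest to lowest estimated uplift."""
    scored = [(name, leaf_uplift(counts)) for name, counts in leaves.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Illustrative counts, not taken from the paper.
    leaves = {
        "leaf_a": LeafCounts(100, 30, 100, 10),
        "leaf_b": LeafCounts(200, 20, 200, 30),
    }
    for name, uplift in rank_leaves(leaves):
        print(f"{name}: {uplift:+.3f}")
```

Leaves with the highest estimated uplift are the natural targets for the treatment, while a negative value suggests the treatment lowers the response rate in that segment.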

Notes

  1. Code, datasets and complementary results are at https://github.com/MinaWagdi/UB-DT.

  2. http://blog.minethatdata.com/2008/03/minethatdata-e-mail-analytics-and-data.html.

  3. https://cran.r-project.org/web/packages/Information/index.html.

  4. https://ods.ai/tracks/df21-megafon/competitions/megafon-df21-comp/data.

  5. https://github.com/joshxinjie/Data_Scientist_Nanodegree/tree/master/starbucks_portfolio_exercisejoshxinjie.

Author information

Corresponding author

Correspondence to Mina Rafla.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Rafla, M., Voisine, N., Crémilleux, B. (2023). Parameter-Free Bayesian Decision Trees for Uplift Modeling. In: Kashima, H., Ide, T., Peng, WC. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2023. Lecture Notes in Computer Science, vol 13936. Springer, Cham. https://doi.org/10.1007/978-3-031-33377-4_24

  • DOI: https://doi.org/10.1007/978-3-031-33377-4_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-33376-7

  • Online ISBN: 978-3-031-33377-4

  • eBook Packages: Computer Science, Computer Science (R0)
