A Non-parametric Bayesian Approach for Uplift Discretization and Feature Selection

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13717)

Abstract

Uplift modeling aims to estimate the incremental impact of a treatment, such as a marketing campaign or a drug, on an individual’s outcome. Uplift data from banking or telecom often contain hundreds to thousands of features. In such situations, detecting irrelevant features is an essential step to reduce computation time and increase model performance. We present a parameter-free feature selection method for uplift modeling founded on a Bayesian approach. We design an automatic feature discretization method for uplift based on a space of discretization models and a prior distribution. From this model space, we define a Bayes optimal evaluation criterion of a discretization model for uplift. We then propose an optimization algorithm that finds a near-optimal discretization for estimating uplift in \(O(n \log n)\) time. Experiments demonstrate the high performance of this new discretization method. We then describe a parameter-free feature selection method for uplift. Experiments show that the new method both removes irrelevant features and achieves better performance than state-of-the-art methods.
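
To make the notion of uplift over a discretized feature concrete, here is a minimal Python sketch. It is not the authors' method: it assumes the cut points are already given, a binary treatment indicator (1 = treated, 0 = control), and a binary outcome, and it simply reports the difference between the treated and control outcome rates in each interval. The function name `uplift_per_interval` and the toy data are hypothetical. The Bayesian criterion and the search for cut points described in the abstract are not reproduced here.

```python
from bisect import bisect_right


def uplift_per_interval(values, treatment, outcome, cut_points):
    """Return the empirical uplift (treated rate minus control rate) per interval.

    Hypothetical helper for illustration only. `cut_points` must be sorted;
    interval i holds the values v with cut_points[i-1] <= v < cut_points[i].
    """
    n_intervals = len(cut_points) + 1
    # counts[i][t] = [number of individuals, number of positive outcomes]
    counts = [[[0, 0], [0, 0]] for _ in range(n_intervals)]
    for v, t, y in zip(values, treatment, outcome):
        i = bisect_right(cut_points, v)  # index of the interval containing v
        counts[i][t][0] += 1
        counts[i][t][1] += y

    uplifts = []
    for (n_c, pos_c), (n_t, pos_t) in counts:
        if n_c == 0 or n_t == 0:
            uplifts.append(None)  # undefined without both groups present
        else:
            uplifts.append(pos_t / n_t - pos_c / n_c)
    return uplifts


# Toy data (made up): one numeric feature, alternating treatment assignment.
values = [1, 2, 3, 4, 6, 7, 8, 9]
treatment = [1, 0, 1, 0, 1, 0, 1, 0]
outcome = [0, 0, 0, 0, 1, 0, 1, 1]

print(uplift_per_interval(values, treatment, outcome, cut_points=[5]))
```

In this toy example the treatment has no measurable effect below the cut point and a positive effect above it, which is the kind of interval-wise difference a discretization for uplift is meant to expose.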

Notes

  1. The terms treatment effect and uplift denote the same notion. CATE is an estimate of uplift, and we use “CATE” to refer to the estimated uplift values.

  2. Our implementation is provided at https://github.com/MinaWagdi/UMODL.

  3. Other patterns can be found via the GitHub link provided previously.

  4. https://doi.org/10.5281/zenodo.3653141.

Corresponding author

Correspondence to Mina Rafla.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Rafla, M., Voisine, N., Crémilleux, B., Boullé, M. (2023). A Non-parametric Bayesian Approach for Uplift Discretization and Feature Selection. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13717. Springer, Cham. https://doi.org/10.1007/978-3-031-26419-1_15

  • DOI: https://doi.org/10.1007/978-3-031-26419-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26418-4

  • Online ISBN: 978-3-031-26419-1

  • eBook Packages: Computer Science, Computer Science (R0)
