Regularization for Uplift Regression

  • Conference paper
Machine Learning and Knowledge Discovery in Databases: Research Track (ECML PKDD 2023)

Abstract

We address the problem of regularizing linear regression models in uplift modeling and heterogeneous treatment effect estimation. We consider interaction models, which are commonly used by statisticians in medicine and the social sciences to estimate the causal effect of a treatment, and introduce a new model of this type. We demonstrate that all interaction models are equivalent when no regularization is present, and that this is no longer the case once the model is regularized. Interaction terms introduce implicit correlations between treatment and control coefficients into the regularizer, a fact which has not been previously noted. The correlations depend on the type of interaction model, and by interpreting the regularizer as a prior distribution we were able to pinpoint the cases in which a given regularized interaction model is most appropriate. An interesting property of the proposed interaction type is that it allows for smooth interpolation between two types of uplift regression models: the double model and the transformed target model. Our results are valid for both ridge (\(L_2\)) and Lasso (\(L_1\)) regularization. Experiments on synthetic data fully confirm our analyses. We also compare the usefulness of various regularization schemes on real data.
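The claim that interaction models coincide without regularization but diverge once a penalty is added can be illustrated numerically. The sketch below is not the authors' code and does not implement the new interaction type proposed in the paper. It assumes one textbook parametrization, y ≈ xβ + t·xγ with γ read as the uplift coefficients, and compares it with a double model (two separate regressions) under a closed-form ridge penalty. The function names, the synthetic data, and the penalty value 50 are all illustrative assumptions.

```python
import numpy as np


def ridge(X, y, lam):
    """Closed-form ridge fit without intercept; lam = 0 gives ordinary least squares."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def double_model_uplift(X, t, y, lam):
    """Fit treated and control rows separately; uplift = difference of coefficients."""
    beta_treated = ridge(X[t == 1], y[t == 1], lam)
    beta_control = ridge(X[t == 0], y[t == 0], lam)
    return beta_treated - beta_control


def interaction_model_uplift(X, t, y, lam):
    """Fit y ~ X @ beta + (t * X) @ gamma on all rows; gamma is the uplift."""
    Z = np.hstack([X, t[:, None] * X])
    coef = ridge(Z, y, lam)
    return coef[X.shape[1]:]


# Hypothetical synthetic data: linear control response plus a linear uplift.
rng = np.random.default_rng(0)
n, p = 400, 3
X = rng.normal(size=(n, p))
t = rng.integers(0, 2, size=n)  # random 0/1 treatment assignment
beta_control_true = np.array([1.0, -2.0, 0.5])
beta_uplift_true = np.array([0.5, 0.0, -1.0])
y = X @ beta_control_true + t * (X @ beta_uplift_true) + rng.normal(scale=0.1, size=n)

# Largest coefficient-wise disagreement between the two uplift estimates.
gap_ols = np.max(np.abs(
    double_model_uplift(X, t, y, 0.0) - interaction_model_uplift(X, t, y, 0.0)))
gap_ridge = np.max(np.abs(
    double_model_uplift(X, t, y, 50.0) - interaction_model_uplift(X, t, y, 50.0)))

print(f"max |difference| without regularization: {gap_ols:.2e}")
print(f"max |difference| with ridge penalty 50:  {gap_ridge:.2e}")
```

Under this parametrization the ridge penalty of the double model acts on the treatment and control coefficients, whereas the penalty of the interaction model acts on the control coefficients and on their difference from the treatment coefficients. This is one way to see why the two fits stop agreeing as soon as the penalty is nonzero.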

Notes

  1. https://github.com/RudasKAP/ECML_PKDD_2023_supplementary

  2. The most popular definition of the multivariate Laplace distribution is based on the square root of a quadratic form; see, e.g., [15].

References

  1. Athey, S., Imbens, G.: Recursive partitioning for heterogeneous causal effects. Proc. Natl. Acad. Sci. 113(27), 7353–7360 (2016)

  2. Belloni, A., Chernozhukov, V., Hansen, C.: High-dimensional methods and inference on structural and treatment effects. J. Econ. Perspect. 28(2), 1–23 (2014)

  3. Betlei, A., Diemert, E., Amini, M.-R.: Uplift prediction with dependent feature representation in imbalanced treatment and control conditions. In: Cheng, L., Leung, A.C.S., Ozawa, S. (eds.) ICONIP 2018. LNCS, vol. 11305, pp. 47–57. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-04221-9_5

  4. Chambers, J.M., Hastie, T.J.: Statistical Models in S. Chapman & Hall, Boca Raton (1993)

  5. Gross, S.M., Tibshirani, R.: Data shared Lasso: a novel tool to discover uplift. Comput. Stat. Data Anal. 101, 226–235 (2016)

  6. Hahn, P.R., Murray, J.S., Carvalho, C.M.: Bayesian regression tree models for causal inference: regularization, confounding, and heterogeneous effects (with Discussion). Bayesian Anal. 15(3), 965–1056 (2020)

  7. Hernán, M., Robins, J.: Causal Inference. Chapman & Hall/CRC, Boca Raton (2018). forthcoming

  8. Heumann, C., Nittner, T., Rao, C., Scheid, S., Toutenburg, H.: Linear Models: Least Squares and Alternatives. Springer, New York (2013). https://doi.org/10.1007/978-1-4899-0024-1

  9. Hill, J.L.: Bayesian nonparametric modeling for causal inference. J. Comput. Graph. Stat. 20(1), 217–240 (2011)

  10. Holland, P.: Statistics and causal inference. J. Am. Stat. Assoc. 81(396), 945–960 (1986)

  11. Imai, K., Ratkovic, M.: Estimating treatment effect heterogeneity in randomized program evaluation. Ann. Appl. Stat. 7, 443–470 (2013)

  12. Imbens, G., Rubin, D.: Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press, New York (2015)

  13. Jaśkowski, M., Jaroszewicz, S.: Uplift modeling for clinical trial data. In: ICML 2012 Workshop on Machine Learning for Clinical Data Analysis, Edinburgh, June 2012

  14. Kane, K., Lo, V.S.Y., Zheng, J.: Mining for the truly responsive customers and prospects using true-lift modeling: comparison of new and existing methods. J. Mark. Analytics 2(4), 218–238 (2014)

  15. Kozubowski, T.J., Podgórski, K., Rychlik, I.: Multivariate generalized Laplace distribution and related random fields. J. Multivar. Anal. 113, 59–72 (2013)

  16. Künzel, S.R., Sekhon, J.S., Bickel, P.J., Yu, B.: Metalearners for estimating heterogeneous treatment effects using machine learning. Proc. Natl. Acad. Sci. 116(10), 4156–4165 (2019). https://doi.org/10.1073/pnas.1804597116

  17. Kuusisto, F., Costa, V.S., Nassif, H., Burnside, E., Page, D., Shavlik, J.: Support vector machines for differential prediction. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014. LNCS (LNAI), vol. 8725, pp. 50–65. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44851-9_4

  18. Lai, L.Y.T.: Influential marketing: a new direct marketing strategy addressing the existence of voluntary buyers. Master’s thesis, Simon Fraser University (2006)

  19. Lalonde, R.: Evaluating the econometric evaluations of training programs. Am. Econ. Rev. 76, 604–620 (1986)

  20. Liaw, F., Klebanov, P., Brooks-Gunn, J.: Effects of early intervention on cognitive function of low birth weight preterm infants. J. Pediatr. 120, 350–359 (1991)

  21. Nyberg, O., Kuśmierczyk, T., Klami, A.: Uplift modeling with high class imbalance. In: Proceedings of the 13th Asian Conference on Machine Learning, pp. 315–330, Bangkok, November 2021

  22. Padilla, O.H.M., Chen, Y., Ruiz, G.: A causal fused Lasso for interpretable heterogeneous treatment effects estimation (2022)

  23. Pearl, J.: Causality. Cambridge University Press, Cambridge (2009)

  24. Petersen, K.B., Pedersen, M.S.: The Matrix Cookbook. Technical University of Denmark, November 2012. version 20121115

  25. Radcliffe, N.J., Surry, P.D.: Real-world uplift modelling with significance-based uplift trees. Portrait Technical report TR-2011-1, Stochastic Solutions (2011)

  26. Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. Wiley, New York (1987)

  27. Rudaś, K., Jaroszewicz, S.: Linear regression for uplift modeling. Data Min. Knowl. Disc. 32(5), 1275–1305 (2018)

  28. Rudaś, K., Jaroszewicz, S.: Shrinkage estimators for uplift regression. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds.) ECML PKDD 2019. LNCS (LNAI), vol. 11906, pp. 607–623. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-46150-8_36

  29. Rzepakowski, P., Jaroszewicz, S.: Decision trees for uplift modeling with single and multiple treatments. Knowl. Inf. Syst. 32, 303–327 (2011)

  30. Spirtes, P., Glymour, C., Scheines, R.: Causation, Prediction, and Search. MIT Press, Cambridge (2001)

  31. Zaniewicz, Ł., Jaroszewicz, S.: Support vector machines for uplift modeling. In: The First IEEE ICDM Workshop on Causal Discovery (CD 2013), Dallas, December 2013

  32. Zaniewicz, Ł., Jaroszewicz, S.: \(l_p\)-support vector machines for uplift modeling. Knowl. Inf. Syst. 53(1), 269–296 (2017)

Author information

Corresponding author

Correspondence to Krzysztof Rudaś.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Rudaś, K., Jaroszewicz, S. (2023). Regularization for Uplift Regression. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14169. Springer, Cham. https://doi.org/10.1007/978-3-031-43412-9_35

  • DOI: https://doi.org/10.1007/978-3-031-43412-9_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43411-2

  • Online ISBN: 978-3-031-43412-9

  • eBook Packages: Computer Science (R0)
