Uplift Modeling Under Limited Supervision

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Research Track (ECML PKDD 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14946)

Abstract

Estimating causal effects in e-commerce tends to involve costly treatment assignments which can be impractical in large-scale settings. Leveraging machine learning to predict such treatment effects without actual intervention is a standard practice to diminish the risk. However, existing methods for treatment effect prediction tend to rely on training sets of substantial size, which are built from real experiments and are thus inherently risky to create. In this work we propose a graph neural network to diminish the required training set size, relying on graphs that are common in e-commerce data. Specifically, we view the problem as node regression with a restricted number of labeled instances, develop a two-model neural architecture akin to previous causal effect estimators, and test varying message-passing layers for encoding. Furthermore, as an extra step, we combine the model with an acquisition function to guide the creation of the training set in settings with extremely low experimental budget. The framework is flexible since each step can be used separately with other models or treatment policies. The experiments on real large-scale networks indicate a clear advantage of our methodology over the state of the art, which in many cases performs close to random, underlining the need for models that can generalize with limited supervision to reduce experimental risks.
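The two-model idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes a single scalar feature per node, one mean-aggregation step standing in for a message-passing layer, and a closed-form linear fit standing in for each neural outcome model. All function names and the toy graph are hypothetical. The acquisition step is not covered.

```python
from statistics import mean


def aggregate(features, neighbors):
    """One mean-aggregation step: average each node's feature with its neighbors'."""
    return {
        node: mean([features[node]] + [features[n] for n in neighbors.get(node, [])])
        for node in features
    }


def fit_line(xs, ys):
    """Closed-form least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    var = sum((x - x_bar) ** 2 for x in xs)
    if var == 0:
        # Degenerate case (one point or identical inputs): predict the mean outcome.
        return y_bar, 0.0
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / var
    return y_bar - slope * x_bar, slope


def two_model_uplift(features, neighbors, labeled):
    """Estimate uplift for every node from a small labeled subset.

    labeled maps node -> (treatment in {0, 1}, observed outcome).
    Each arm needs at least one labeled node, otherwise mean() raises.
    """
    encoded = aggregate(features, neighbors)
    models = {}
    for arm in (0, 1):
        nodes = [n for n, (t, _) in labeled.items() if t == arm]
        models[arm] = fit_line(
            [encoded[n] for n in nodes], [labeled[n][1] for n in nodes]
        )

    def predict(arm, x):
        intercept, slope = models[arm]
        return intercept + slope * x

    # Uplift is the predicted treated outcome minus the predicted control outcome.
    return {n: predict(1, encoded[n]) - predict(0, encoded[n]) for n in encoded}


# Hypothetical toy data: a 4-node path graph with 2 labeled nodes per arm.
toy_features = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}
toy_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
toy_labeled = {0: (0, 0.5), 3: (0, 2.5), 1: (1, 3.0), 2: (1, 4.0)}
toy_uplift = two_model_uplift(toy_features, toy_neighbors, toy_labeled)
```

In this toy setup only four nodes carry labels, mirroring the limited-supervision setting; predictions for any unlabeled node would come from the two fitted arms applied to its aggregated feature.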

Notes

  1. https://github.com/geopanag/UMGNet

  2. https://ods.ai/competitions/x5-retailhero-uplift-modeling

Acknowledgements

Supported in part by ANR (French National Research Agency) under the JCJC project GraphIA (ANR-20-CE23-0009-01).

Author information

Correspondence to George Panagopoulos.

Ethics declarations

Ethics

This study only involved public datasets that are freely available for academic purposes. There are no obvious ethical considerations regarding negative impacts from the broader application of the method.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Panagopoulos, G., Malitesta, D., Malliaros, F.D., Pang, J. (2024). Uplift Modeling Under Limited Supervision. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14946. Springer, Cham. https://doi.org/10.1007/978-3-031-70365-2_8

  • DOI: https://doi.org/10.1007/978-3-031-70365-2_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70364-5

  • Online ISBN: 978-3-031-70365-2

  • eBook Packages: Computer Science, Computer Science (R0)
