Analyzing the Impact of Personalization on Fairness in Federated Learning for Healthcare

  • Research Article
  • Published:
Journal of Healthcare Informatics Research

Abstract

As machine learning (ML) becomes increasingly popular in the healthcare sector, concerns are growing about potential biases and privacy risks. One countermeasure is to use federated learning (FL), which supports collaborative learning without requiring patient data to be shared across organizations. However, the inherent heterogeneity of data distributions among participating FL parties poses challenges for studying group fairness in FL. While personalization within FL can mitigate the performance degradation caused by data heterogeneity, its influence on group fairness has not been fully investigated. The primary focus of this study is therefore to rigorously assess the impact of personalized FL on group fairness in the healthcare domain, offering a comprehensive understanding of how personalization affects fairness in clinical outcomes. We conduct an empirical analysis using two prominent real-world Electronic Health Records (EHR) datasets, eICU and MIMIC-IV. Our methodology involves a thorough comparison between personalized FL and two baselines: standalone training, where models are developed independently without FL collaboration, and standard FL, which learns a global model via the FedAvg algorithm. We adopt Ditto as our personalized FL approach, which enables each FL client to develop its own personalized model through multi-task learning. Our assessment is carried out through a series of evaluations comparing the predictive performance (i.e., AUROC and AUPRC) and fairness gaps (i.e., EOPP, EOD, and DP) of these methods. Personalized FL demonstrates superior predictive accuracy and fairness over standalone training across both datasets. However, compared with standard FL, personalized FL improves predictive accuracy but does not consistently offer better fairness outcomes. For instance, in the 24-h in-hospital mortality prediction task, personalized FL achieves an average EOD of 27.4% across racial groups in the eICU dataset and 47.8% in MIMIC-IV. In comparison, standard FL records a better EOD of 26.2% for eICU and 42.0% for MIMIC-IV, while standalone training yields significantly worse EODs of 69.4% and 54.7% on these datasets, respectively. Our analysis reveals that personalized FL has the potential to enhance fairness relative to standalone training, yet it does not consistently ensure fairness improvements over standard FL. Our findings also show that while personalization can improve fairness for more biased hospitals (i.e., hospitals with larger fairness gaps in standalone training), it can exacerbate fairness issues for less biased ones. These insights suggest that integrating personalized FL with additional strategic designs could be key to simultaneously boosting prediction accuracy and reducing fairness disparities. The findings and opportunities outlined in this paper can inform the research agenda for future studies aiming to overcome these limitations and further advance health equity research.
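To make the compared training schemes and metrics concrete, the sketch below is a minimal, hypothetical illustration rather than the authors' released code (the function names, the PyTorch models, the regularization weight lam, and the data loader are placeholder assumptions): it shows a Ditto-style local update, in which each client minimizes its own loss plus a proximal term pulling its personal model toward the shared FedAvg model, together with an equalized odds difference (EOD) gap computed for a binary outcome and a binary sensitive attribute.

```python
import torch

def ditto_local_update(personal_model, global_model, loader, lam=0.1, lr=1e-3):
    """One local personalization round in the style of Ditto (Li et al., 2021):
    minimize the client's own loss plus (lam/2) * ||v - w_global||^2, which keeps
    the personal model v close to the shared FedAvg model w_global."""
    opt = torch.optim.SGD(personal_model.parameters(), lr=lr)
    bce = torch.nn.BCEWithLogitsLoss()
    global_params = [p.detach().clone() for p in global_model.parameters()]
    personal_model.train()
    for x, y in loader:
        opt.zero_grad()
        loss = bce(personal_model(x).squeeze(-1), y.float())
        prox = sum(((p - g) ** 2).sum()
                   for p, g in zip(personal_model.parameters(), global_params))
        (loss + 0.5 * lam * prox).backward()
        opt.step()
    return personal_model

def eod_gap(y_true, y_pred, group):
    """Equalized odds difference for a binary sensitive attribute: the larger of
    the absolute TPR gap and the absolute FPR gap between the two groups."""
    gaps = []
    for label in (1, 0):  # label=1 gives the TPR gap, label=0 the FPR gap
        rates = []
        for g in (0, 1):
            mask = (group == g) & (y_true == label)
            rates.append(y_pred[mask].float().mean() if mask.any()
                         else torch.tensor(0.0))
        gaps.append(torch.abs(rates[0] - rates[1]))
    return max(gaps).item()
```

Under this framing, standalone training corresponds to dropping the proximal term and never aggregating, while standard FL evaluates the shared FedAvg model directly; the EOPP and DP gaps reported above are defined analogously from the true-positive rate and the overall positive prediction rate, respectively.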

Availability of Data and Materials

Data used in this study are openly available and free for research [59, 64].

Code Availability

Code will be made available upon request.

References

  1. Purushotham S, Meng C, Che Z, Liu Y (2018) Benchmarking deep learning models on large healthcare datasets. J Biomed Inform 83:112–134

  2. Harutyunyan H, Khachatrian H, Kale DC, Ver Steeg G, Galstyan A (2019) Multitask learning and benchmarking with clinical time series data. Sci Data 6(1):96

  3. Wang S, McDermott MB, Chauhan G, Ghassemi M, Hughes MC, Naumann T (2020) MIMIC-extract: a data extraction, preprocessing, and representation pipeline for MIMIC-III. In: Proceedings of the ACM conference on health, inference, and learning, pp 222–235

  4. Bhatt P, Liu J, Gong Y, Wang J, Guo Y (2022) Emerging artificial intelligence-empowered mHealth: scoping review. JMIR mHealth and uHealth 10(6):35053

  5. Rieke N, Hancox J, Li W, Milletari F, Roth HR, Albarqouni S, Bakas S, Galtier MN, Landman BA, Maier-Hein K et al (2020) The future of digital health with federated learning. NPJ Digit Med 3(1):119

  6. Chen IY, Szolovits P, Ghassemi M (2019) Can AI help reduce disparities in general medical and mental health care? AMA J Ethics 21(2):167–179

  7. Leslie D, Mazumder A, Peppin A, Wolters MK, Hagerty A (2021) Does AI stand for augmenting inequality in the era of COVID-19 healthcare? BMJ 372

  8. Braveman P (2006) Health disparities and health equity: concepts and measurement. Annu Rev Public Health 27:167–194

  9. Ghassemi M, Naumann T, Schulam P, Beam AL, Chen IY, Ranganath R (2020) A review of challenges and opportunities in machine learning for health. AMIA Summits Transl Sci Proc 2020:191

  10. Zhang H, Lu AX, Abdalla M, McDermott M, Ghassemi M (2020) Hurtful words: quantifying biases in clinical contextual word embeddings. In: Proceedings of the ACM conference on health, inference, and learning, pp 110–120

  11. Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G (2018) Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med 178(11):1544–1547

  12. Popejoy AB, Ritter DI, Crooks K, Currey E, Fullerton SM, Hindorff LA, Koenig B, Ramos EM, Sorokin EP, Wand H et al (2018) The clinical imperative for inclusivity: race, ethnicity, and ancestry (REA) in genomics. Hum Mutat 39(11):1713–1720

  13. Rajkomar A, Hardt M, Howell MD, Corrado G, Chin MH (2018) Ensuring fairness in machine learning to advance health equity. Ann Intern Med 169(12):866–872

  14. Voigt P, Bussche A (2017) The EU General Data Protection Regulation (GDPR): a practical guide, 1st edn. Springer International Publishing, Cham

  15. US Department of Health and Human Services (2013) Modifications to the HIPAA privacy, security, enforcement, and breach notification rules under the Health Information Technology for Economic and Clinical Health Act and the Genetic Information Nondiscrimination Act; other modifications to the HIPAA rules. Fed Regist 78(17):5566–5702

  16. Dwork C, Hardt M, Pitassi T, Reingold O, Zemel R (2012) Fairness through awareness. In: Proceedings of the 3rd innovations in theoretical computer science conference, pp 214–226

  17. Feldman M, Friedler SA, Moeller J, Scheidegger C, Venkatasubramanian S (2015) Certifying and removing disparate impact. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, pp 259–268

  18. Hardt M, Price E, Srebro N (2016) Equality of opportunity in supervised learning. Adv Neural Inf Process Syst 29

  19. Agarwal A, Dudík M, Wu ZS (2019) Fair regression: Quantitative definitions and reduction-based algorithms. In: International conference on machine learning. PMLR, pp 120–129

  20. Agarwal A, Beygelzimer A, Dudík M, Langford J, Wallach H (2018) A reductions approach to fair classification. In: International conference on machine learning. PMLR, pp 60–69

  21. Roh Y, Lee K, Whang SE, Suh C (2021) FairBatch: batch selection for model fairness. In: 9th International conference on learning representations

  22. Chai J, Wang X (2022) Fairness with adaptive weights. In: International conference on machine learning. PMLR, pp 2853–2866

  23. McMahan B, Moore E, Ramage D, Hampson S, Arcas BA (2017) Communication-efficient learning of deep networks from decentralized data. In: Artificial intelligence and statistics. PMLR, pp 1273–1282

  24. Wu X, Huang F, Hu Z, Huang H (2023) Faster adaptive federated learning. In: Proceedings of the AAAI conference on artificial intelligence, vol 37, pp 10379–10387

  25. Guo Y, Sun Y, Hu R, Gong Y (2022) Hybrid local sgd for federated learning with heterogeneous communications. In: International conference on learning representations

  26. Hu R, Gong Y, Guo Y (2021) Federated learning with sparsification-amplified privacy and adaptive optimization. In: Proceedings of the thirtieth international joint conference on artificial intelligence

  27. Wang T, Du Y, Gong Y, Choo K-KR, Guo Y (2023) Applications of federated learning in mobile health: scoping review. J Med Internet Res 25:43006

  28. Wang T, Guo Y, Choo K-KR (2023) Enabling privacy-preserving prediction for length of stay in ICU-a multimodal federated-learning-based approach. In: European conference on information systems (ECIS)

  29. Cui S, Pan W, Liang J, Zhang C, Wang F (2021) Addressing algorithmic disparity and performance inconsistency in federated learning. Adv Neural Inf Process Syst 34:26091–26102

  30. Du W, Xu D, Wu X, Tong H (2021) Fairness-aware agnostic federated learning. In: Proceedings of the 2021 SIAM International Conference on Data Mining (SDM). SIAM, pp 181–189

  31. Papadaki A, Martinez N, Bertran M, Sapiro G, Rodrigues M (2022) Minimax demographic group fairness in federated learning. In: 2022 ACM Conference on fairness, accountability, and transparency, pp 142–159

  32. Chang H, Shokri R (2023) Bias propagation in federated learning. In: The Eleventh international conference on learning representations. https://openreview.net/forum?id=V7CYzdruWdm

  33. Smith V, Chiang C-K, Sanjabi M, Talwalkar AS (2017) Federated multi-task learning. Adv Neural Inf Process Syst 30

  34. Li T, Hu S, Beirami A, Smith V (2021) Ditto: fair and robust federated learning through personalization. In: International conference on machine learning. PMLR, pp 6357–6368

  35. Collins L, Hassani H, Mokhtari A, Shakkottai S (2021) Exploiting shared representations for personalized federated learning. In: International conference on machine learning. PMLR, pp 2089–2099

  36. Zhao Y, Li M, Lai L, Suda N, Civin D, Chandra V (2018) Federated learning with non-iid data. Preprint at arXiv:1806.00582

  37. Friedler SA, Scheidegger C, Venkatasubramanian S, Choudhary S, Hamilton EP, Roth D (2019) A comparative study of fairness-enhancing interventions in machine learning. In: Proceedings of the conference on fairness, accountability, and transparency, pp 329–338

  38. Blum A, Stangl K (2020) Recovering from biased data: can fairness constraints improve accuracy? In: 1st Symposium on foundations of responsible computing

  39. Zhang BH, Lemoine B, Mitchell M (2018) Mitigating unwanted biases with adversarial learning. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp 335–340

  40. Kim MP, Ghorbani A, Zou J (2019) Multiaccuracy: Black-box post-processing for fairness in classification. In: Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pp 247–254

  41. Pfohl S, Marafino B, Coulet A, Rodriguez F, Palaniappan L, Shah NH (2019) Creating fair models of atherosclerotic cardiovascular disease risk. In: Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pp 271–278

  42. Pfohl SR, Duan T, Ding DY, Shah NH (2019) Counterfactual reasoning for fair clinical risk prediction. In: Machine learning for healthcare conference. PMLR, pp 325–358

  43. Marcinkevics R, Ozkan E, Vogt JE (2022) Debiasing deep chest x-ray classifiers using intra-and post-processing methods. In: Machine Learning for Healthcare Conference. PMLR, pp 504–536

  44. Ezzeldin YH, Yan S, He C, Ferrara E, Avestimehr AS (2023) FairFed: enabling group fairness in federated learning. In: Proceedings of the AAAI conference on artificial intelligence, vol 37, pp 7494–7502

  45. Finn C, Abbeel P, Levine S (2017) Model-agnostic meta-learning for fast adaptation of deep networks. In: International conference on machine learning. PMLR, pp 1126–1135

  46. Khodak M, Balcan M-FF, Talwalkar AS (2019) Adaptive gradient-based meta-learning methods. Adv Neural Inf Process Syst 32

  47. Hu R, Guo Y, Li H, Pei Q, Gong Y (2020) Personalized federated learning with differential privacy. IEEE Internet Things J 7(10):9530–9539

  48. Dinh CT, Tran N, Nguyen J (2020) Personalized federated learning with moreau envelopes. Adv Neural Inf Process Syst 33:21394–21405

  49. Li D, Wang J (2019) FedMD: heterogeneous federated learning via model distillation. Preprint at arXiv:1910.03581

  50. Deng Y, Kamani MM, Mahdavi M (2020) Adaptive personalized federated learning. Preprint at arXiv:2003.13461

  51. Liang PP, Liu T, Ziyin L, Allen NB, Auerbach RP, Brent D, Salakhutdinov R, Morency L-P (2020) Think locally, act globally: federated learning with local and global representations. Preprint at arXiv:2001.01523

  52. Qin Z, Yao L, Chen D, Li Y, Ding B, Cheng M (2023) Revisiting personalized federated learning: Robustness against backdoor attacks. In: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. KDD ’23, Association for Computing Machinery, New York, USA, pp 4743–4755

  53. Li X, Jiang M, Zhang X, Kamp M, Dou Q (2021) FedBN: Federated learning on non-IID features via local batch normalization. In: International conference on learning representations. https://openreview.net/forum?id=6YEQUn0QICG

  54. Li T, Sahu AK, Zaheer M, Sanjabi M, Talwalkar A, Smith V (2020) Federated optimization in heterogeneous networks. Proc Mach Learn Syst 2:429–450

  55. Chen H-Y, Chao W-L (2022) On bridging generic and personalized federated learning for image classification. In: International conference on learning representations. https://openreview.net/forum?id=I1hQbx10Kxn

  56. Fallah A, Mokhtari A, Ozdaglar A (2020) Personalized federated learning with theoretical guarantees: a model-agnostic meta-learning approach. Adv Neural Inf Process Syst 33:3557–3568

  57. Li C, Niu D, Jiang B, Zuo X, Yang J (2021) Meta-HAR: federated representation learning for human activity recognition. In: Proceedings of the web conference 2021, pp 912–922

  58. Wu Q, Chen X, Zhou Z, Zhang J (2020) FedHome: cloud-edge based personalized federated learning for in-home health monitoring. IEEE Trans Mob Comput 21(8):2818–2832

  59. Pollard TJ, Johnson AE, Raffa JD, Celi LA, Mark RG, Badawi O (2018) The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Sci Data 5(1):1–13

  60. Rocheteau E, Liò P, Hyland S (2021) Temporal pointwise convolutional networks for length of stay prediction in the intensive care unit. In: Proceedings of the conference on health, inference, and learning, pp 58–68

  61. Obermeyer Z, Powers B, Vogeli C, Mullainathan S (2019) Dissecting racial bias in an algorithm used to manage the health of populations. Science 366(6464):447–453

  62. Mauvais-Jarvis F, Merz NB, Barnes PJ, Brinton RD, Carrero J-J, DeMeo DL, De Vries GJ, Epperson CN, Govindan R, Klein SL et al (2020) Sex and gender: modifiers of health, disease, and medicine. Lancet 396(10250):565–582

  63. Che Z, Purushotham S, Cho K, Sontag D, Liu Y (2018) Recurrent neural networks for multivariate time series with missing values. Sci Rep 8(1):1–12

  64. Johnson A, Bulgarelli L, Pollard T, Horng S, Celi LA, Mark R (2020) MIMIC-IV (version 0.4). PhysioNet. Available online at: https://physionet.org/content/mimiciv/0.4/. Accessed 13 Aug 2020

  65. Hsu T-MH, Qi H, Brown M (2019) Measuring the effects of non-identical data distribution for federated visual classification. Preprint at arXiv:1909.06335

  66. Poulain R, Bin Tarek MF, Beheshti R (2023) Improving fairness in ai models on electronic health records: the case for federated learning methods. In: Proceedings of the 2023 ACM conference on fairness, accountability, and transparency, pp 1599–1608

  67. Kalchbrenner N, Espeholt L, Simonyan K, van den Oord A, Graves A, Kavukcuoglu K (2016) Neural machine translation in linear time. Preprint at arXiv:1610.10099

  68. van den Oord A, Dieleman S, Zen H, Simonyan K, Vinyals O, Graves A, Kalchbrenner N, Senior A, Kavukcuoglu K (2016) WaveNet: a generative model for raw audio. Preprint at arXiv:1609.03499

  69. Sundararajan M, Taly A, Yan Q (2017) Axiomatic attribution for deep networks. In: International conference on machine learning. PMLR, pp 3319–3328

Funding

The work of Y. Guo was partially supported by NSF CNS-2106761, CMMI-2222670, and UTSA Office of the Vice President for Research, Economic Development, and Knowledge Enterprise. The work of Y. Gong was partially supported by NSF CNS-2047761, CNS-2106761, and Cisco Research Award. The work of J. Cai was partially supported by NSF CMMI-2222670.

Author information

Contributions

Tongnian Wang: conception, implementation, analysis, and writing. Kai Zhang: writing support and cross-reading. Jiannan Cai: writing support and cross-reading. Yanmin Gong: conception, writing support, and cross-reading. Kim-Kwang Raymond Choo: conception and working as co-supervisor. Yuanxiong Guo: providing ideas and working as supervisor. All authors contributed to the manuscript and reviewed it.

Corresponding author

Correspondence to Yuanxiong Guo.

Ethics declarations

Ethics Approval

Not applicable.

Consent to Participate

Not applicable.

Consent for Publication

Not applicable.

Conflict of Interest

The authors declare no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, T., Zhang, K., Cai, J. et al. Analyzing the Impact of Personalization on Fairness in Federated Learning for Healthcare. J Healthc Inform Res 8, 181–205 (2024). https://doi.org/10.1007/s41666-024-00164-7

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s41666-024-00164-7

Keywords
