Fairness with censorship and group constraints

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Fairness in machine learning (ML) has gained attention within the ML community and in broader society, with many fairness definitions and algorithms being proposed. Surprisingly, there is little work on quantifying and guaranteeing fairness in the presence of uncertainty, which is prevalent in many socially sensitive applications, ranging from marketing analytics to actuarial analysis and recidivism prediction instruments. To this end, we revisit fairness and reveal idiosyncrasies of the existing fairness literature, which assumes certainty about the class label and is thereby limited in real-world utility. Our primary contributions are a formulation of fairness under uncertainty and group constraints, along with a suite of corresponding new fairness definitions and an algorithm. We argue that this formulation has broader applicability to practical scenarios concerning fairness. We also show how the newly devised fairness notions involving censored information, together with the general framework for fair predictions in the presence of censorship, allow us to measure and mitigate discrimination under uncertainty, bridging the gap with real-world applications. Empirical evaluations on real-world datasets with censorship and sensitive attributes demonstrate the practicality of our approach.

Acknowledgements

This research was supported in part by the Intramural Research Program of the National Library of Medicine (NLM), National Institutes of Health, and an NVIDIA GPU Grant.

Author information

Corresponding author

Correspondence to Wenbin Zhang.

About this article

Cite this article

Zhang, W., Weiss, J.C. Fairness with censorship and group constraints. Knowl Inf Syst 65, 2571–2594 (2023). https://doi.org/10.1007/s10115-023-01842-5

  • DOI: https://doi.org/10.1007/s10115-023-01842-5
