Abstract
Fairness in machine learning (ML) has gained attention within the ML community and in broader society, with many fairness definitions and algorithms being proposed. Surprisingly, there is little work quantifying and guaranteeing fairness in the presence of uncertainty, which is prevalent in many socially sensitive applications, ranging from marketing analytics to actuarial analysis and recidivism prediction instruments. To this end, we revisit fairness and reveal idiosyncrasies of the existing fairness literature, which assumes certainty on the class label and thereby limits its real-world utility. Our primary contributions are formulating fairness under uncertainty and group constraints, along with a suite of corresponding new fairness definitions and an algorithm. We argue that this formulation has broader applicability to practical scenarios concerning fairness. We also show how the newly devised fairness notions involving censored information, together with the general framework for fair predictions in the presence of censorship, allow us to measure and mitigate discrimination under uncertainty, bridging the gap with real-world applications. Empirical evaluations on real-world datasets with censorship and sensitive attributes demonstrate the practicality of our approach.
Acknowledgements
This research was supported in part by the Intramural Research Program of the National Library of Medicine (NLM), National Institutes of Health, and an NVIDIA GPU Grant.
About this article
Cite this article
Zhang, W., Weiss, J.C. Fairness with censorship and group constraints. Knowl Inf Syst 65, 2571–2594 (2023). https://doi.org/10.1007/s10115-023-01842-5
DOI: https://doi.org/10.1007/s10115-023-01842-5