Abstract
There is growing interest in learning, from data, classifiers whose predictions are both accurate and fair, avoiding discrimination against sub-groups of people based, e.g., on gender or race. This paper proposes a new Lexicographic multi-objective Genetic Algorithm for Fair Feature Selection (LGAFFS). LGAFFS selects a subset of relevant features that is optimised for a given classification algorithm by simultaneously optimising one measure of predictive accuracy and four measures of fairness. This is achieved with a lexicographic multi-objective optimisation approach in which the objective of optimising accuracy takes priority over the objective of optimising the four fairness measures. LGAFFS was used to select features in a pre-processing phase for a random forest algorithm. The experiments compared LGAFFS' performance against two feature selection approaches: (a) the baseline approach of letting the random forest algorithm use all features, i.e. no feature selection in a pre-processing phase; and (b) a Sequential Forward Selection method. The results showed that LGAFFS significantly improved fairness measures in several cases, while showing no significant difference in predictive accuracy across all experiments.
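To make the lexicographic idea described above concrete, the sketch below shows one plausible way to compare two candidate feature subsets: accuracy is the higher-priority objective, and the four fairness measures are consulted only when the accuracies are effectively tied. The function name, the tolerance parameter and the averaging of the fairness measures are illustrative assumptions for this sketch; they are not the exact mechanism used by LGAFFS in the paper.

```python
import numpy as np

def lexicographic_better(cand, incumbent, acc_tol=0.01):
    """Return True if `cand` beats `incumbent` under a lexicographic ordering:
    accuracy first, then the four fairness measures.

    `cand` and `incumbent` are dicts with an 'accuracy' entry (higher is better)
    and a 'fairness' entry holding four measures scaled so that higher is fairer.
    The tolerance `acc_tol` and the mean aggregation of the fairness measures
    are assumptions made for illustration only.
    """
    # Highest-priority objective: predictive accuracy.
    if cand["accuracy"] > incumbent["accuracy"] + acc_tol:
        return True
    if cand["accuracy"] < incumbent["accuracy"] - acc_tol:
        return False
    # Accuracies are tied within the tolerance: fall through to the
    # lower-priority objectives (the four fairness measures).
    return float(np.mean(cand["fairness"])) > float(np.mean(incumbent["fairness"]))

# Example: two candidate feature subsets with near-tied accuracy but
# different fairness profiles; the fairer one wins the comparison.
a = {"accuracy": 0.840, "fairness": [0.91, 0.88, 0.90, 0.87]}
b = {"accuracy": 0.845, "fairness": [0.80, 0.79, 0.83, 0.78]}
print(lexicographic_better(a, b))  # True: accuracies tie within acc_tol, a is fairer
```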
Acknowledgements
This work was funded by a research grant from The Leverhulme Trust, UK, reference number RPG-2020-145.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Brookhouse, J., Freitas, A. (2022). Fair Feature Selection with a Lexicographic Multi-objective Genetic Algorithm. In: Rudolph, G., Kononova, A.V., Aguirre, H., Kerschke, P., Ochoa, G., Tušar, T. (eds) Parallel Problem Solving from Nature – PPSN XVII. PPSN 2022. Lecture Notes in Computer Science, vol 13399. Springer, Cham. https://doi.org/10.1007/978-3-031-14721-0_11
DOI: https://doi.org/10.1007/978-3-031-14721-0_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14720-3
Online ISBN: 978-3-031-14721-0
eBook Packages: Computer Science, Computer Science (R0)