Abstract
Similarity search in high-dimensional spaces is an important primitive operation in many application domains. Locality Sensitive Hashing (LSH) is a popular technique for solving the Approximate Nearest Neighbor (ANN) problem in high-dimensional spaces. Alongside the need for fair machine learning models, there is a need for data structures that target different types of fairness. In this paper, we propose a fair variant of the ANN problem that targets equal opportunity in group fairness, and we formally introduce the notion of fair ANN for equal opportunity. We also present FairLSH, an efficient disk-based index structure that uses Locality Sensitive Hashing to find fair approximate nearest neighbors. Moreover, we present an advanced version of FairLSH that uses cost models to further balance the trade-off between I/O cost and processing time. Finally, we show experimentally that FairLSH returns fair results with very low I/O cost and processing time compared with state-of-the-art LSH techniques.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Jafari, O., Maurya, P., Islam, K.M., Nagarkar, P. (2021). Optimizing Fair Approximate Nearest Neighbor Searches Using Threaded B+-Trees. In: Reyes, N., et al. Similarity Search and Applications. SISAP 2021. Lecture Notes in Computer Science, vol 13058. Springer, Cham. https://doi.org/10.1007/978-3-030-89657-7_11
DOI: https://doi.org/10.1007/978-3-030-89657-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89656-0
Online ISBN: 978-3-030-89657-7
eBook Packages: Computer Science, Computer Science (R0)