
Data Transparency and Fairness Analysis of the NYPD Stop-and-Frisk Program

Published: 11 February 2022

Abstract

Given the increasing concern about racial disparities in stop-and-frisk programs, the New York Police Department (NYPD) is required to publicly release detailed data on every stop conducted by its officers, including the suspected offense and the race of the suspect. This public data transparency policy makes it possible to investigate racial bias in stop-and-frisk data and to demonstrate how data transparency can confirm or refute beliefs about police practices. Data transparency is therefore a crucial need in the era of Artificial Intelligence (AI), in which police and judicial authorities increasingly use AI techniques not only to understand police practices but also to predict recidivism, crime, and terrorism. In this study, we develop a predictive analytics method, including bias metrics and bias mitigation techniques, to analyze the NYPD Stop-and-Frisk datasets and determine whether underlying bias patterns are responsible for stops and arrests. In addition, we perform a fairness analysis on two protected attributes, race and gender, and investigate their impact on arrest decisions, and we apply bias mitigation techniques. The experimental results show that the NYPD Stop-and-Frisk dataset is not biased against people of color and Hispanic individuals, and thus law enforcement authorities can apply the proposed predictive analytics method to make fairer decisions before any arrests are made.
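The fairness analysis described above rests on standard group-fairness metrics such as statistical parity difference and disparate impact, computed for an arrest outcome with respect to a protected attribute such as race or gender. The following Python snippet is a minimal illustrative sketch of such metrics, not the authors' implementation; the column names and toy records are assumptions made purely for the example.

```python
# Minimal sketch (not the authors' code): two common group-fairness metrics --
# statistical parity difference (SPD) and disparate impact (DI) -- for a binary
# arrest outcome with respect to a protected attribute. Column names and the
# toy records below are hypothetical.
import pandas as pd

# Hypothetical stop-level records: arrested = 1 if the stop ended in an arrest.
stops = pd.DataFrame({
    "race":     ["BLACK", "WHITE", "BLACK", "HISPANIC", "WHITE", "BLACK", "HISPANIC", "WHITE"],
    "gender":   ["M", "M", "F", "M", "F", "M", "F", "M"],
    "arrested": [1, 0, 0, 1, 0, 1, 0, 1],
})

def group_fairness(data: pd.DataFrame, protected: str, unprivileged: set, outcome: str):
    """Return (SPD, DI) for the unprivileged group versus everyone else.

    SPD = P(outcome = 1 | unprivileged) - P(outcome = 1 | privileged)
    DI  = P(outcome = 1 | unprivileged) / P(outcome = 1 | privileged)
    """
    mask = data[protected].isin(unprivileged)
    p_unpriv = data.loc[mask, outcome].mean()
    p_priv = data.loc[~mask, outcome].mean()
    return p_unpriv - p_priv, p_unpriv / p_priv

spd, di = group_fairness(stops, "race", {"BLACK", "HISPANIC"}, "arrested")
print(f"Statistical parity difference: {spd:+.3f}")  # 0 means parity
print(f"Disparate impact ratio:        {di:.3f}")    # 1 means parity
```

A statistical parity difference near 0 (equivalently, a disparate impact ratio near 1) indicates similar arrest rates across groups; a disparate impact ratio below the commonly cited 0.8 threshold is often treated as evidence of adverse impact against the unprivileged group.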

Cited By

  • Distributed Cooperative Coevolution of Data Publishing Privacy and Transparency. ACM Transactions on Knowledge Discovery from Data 18, 1 (2023), 1–23. https://doi.org/10.1145/3613962
  • A Review on Fairness in Machine Learning. ACM Computing Surveys 55, 3 (2022), 1–44. https://doi.org/10.1145/3494672

Published In

Journal of Data and Information Quality, Volume 14, Issue 2
June 2022, 150 pages
ISSN: 1936-1955
EISSN: 1936-1963
DOI: 10.1145/3505186

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 February 2022
Accepted: 01 April 2021
Revised: 01 April 2021
Received: 01 December 2020
Published in JDIQ Volume 14, Issue 2

Author Tags

  1. Bias
  2. artificial intelligence
  3. machine learning
  4. data transparency

Qualifiers

  • Research-article
  • Refereed
