
Data Transparency and Fairness Analysis of the NYPD Stop-and-Frisk Program

Published: 11 February 2022

Abstract

Given the increasing concern about racial disparities in stop-and-frisk programs, the New York Police Department (NYPD) is required to publicly release detailed data on every stop conducted by its officers, including the suspected offense and the race of the suspect. This public data transparency policy makes it possible to investigate racial bias in stop-and-frisk data and to demonstrate how data transparency can confirm or refute beliefs about police practices. Data transparency is therefore a crucial need in the era of Artificial Intelligence (AI), in which police and judicial authorities increasingly use AI techniques not only to understand police practices but also to predict recidivism, crime, and terrorism. In this study, we develop a predictive analytics method, including bias metrics and bias mitigation techniques, to analyze the NYPD Stop-and-Frisk datasets and determine whether underlying bias patterns are responsible for stops and arrests. In addition, we perform a fairness analysis on two protected attributes, race and gender, and investigate their impact on arrest decisions, and we apply bias mitigation techniques. The experimental results show that the NYPD Stop-and-Frisk dataset is not biased against people of color and Hispanic individuals, and thus law enforcement authorities can apply the proposed predictive analytics method to make fairer decisions before any arrests are made.
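The fairness analysis described above rests on standard group-fairness metrics such as statistical parity difference and disparate impact, computed for an arrest outcome with respect to a protected attribute such as race or gender. The following Python snippet is a minimal illustrative sketch of such metrics, not the authors' implementation; the column names and toy records are assumptions made purely for the example.

```python
# Minimal sketch (not the authors' code): two common group-fairness metrics --
# statistical parity difference (SPD) and disparate impact (DI) -- for a binary
# arrest outcome with respect to a protected attribute. Column names and the
# toy records below are hypothetical.
import pandas as pd

# Hypothetical stop-level records: arrested = 1 if the stop ended in an arrest.
stops = pd.DataFrame({
    "race":     ["BLACK", "WHITE", "BLACK", "HISPANIC", "WHITE", "BLACK", "HISPANIC", "WHITE"],
    "gender":   ["M", "M", "F", "M", "F", "M", "F", "M"],
    "arrested": [1, 0, 0, 1, 0, 1, 0, 1],
})

def group_fairness(data: pd.DataFrame, protected: str, unprivileged: set, outcome: str):
    """Return (SPD, DI) for the unprivileged group versus everyone else.

    SPD = P(outcome = 1 | unprivileged) - P(outcome = 1 | privileged)
    DI  = P(outcome = 1 | unprivileged) / P(outcome = 1 | privileged)
    """
    mask = data[protected].isin(unprivileged)
    p_unpriv = data.loc[mask, outcome].mean()
    p_priv = data.loc[~mask, outcome].mean()
    return p_unpriv - p_priv, p_unpriv / p_priv

spd, di = group_fairness(stops, "race", {"BLACK", "HISPANIC"}, "arrested")
print(f"Statistical parity difference: {spd:+.3f}")  # 0 means parity
print(f"Disparate impact ratio:        {di:.3f}")    # 1 means parity
```

A statistical parity difference near 0 (equivalently, a disparate impact ratio near 1) indicates similar arrest rates across groups; a disparate impact ratio below the commonly cited 0.8 threshold is often treated as evidence of adverse impact against the unprivileged group.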

Cited By

  • Distributed Cooperative Coevolution of Data Publishing Privacy and Transparency. ACM Transactions on Knowledge Discovery from Data 18, 1 (2023), 1–23. https://doi.org/10.1145/3613962
  • A Review on Fairness in Machine Learning. ACM Computing Surveys 55, 3 (2022), 1–44. https://doi.org/10.1145/3494672

Published In

Journal of Data and Information Quality, Volume 14, Issue 2
June 2022, 150 pages
ISSN: 1936-1955
EISSN: 1936-1963
DOI: 10.1145/3505186

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 February 2022
Accepted: 01 April 2021
Revised: 01 April 2021
Received: 01 December 2020
Published in JDIQ Volume 14, Issue 2

Author Tags

  1. Bias
  2. artificial intelligence
  3. machine learning
  4. data transparency

Qualifiers

  • Research-article
  • Refereed
