From Bias to Repair: Error as a Site of Collaboration and Negotiation in Applied Data Science Work

Authors:
Cindy Kaiying Lin, Pennsylvania State University, State College, PA, USA (ORCID: 0000-0002-9273-4970)
Steven J. Jackson, Cornell University, Ithaca, NY, USA (ORCID: 0000-0002-4426-1320)

Proceedings of the ACM on Human-Computer Interaction, Volume 7, Issue CSCW1, Article No. 131, pp. 1–32. https://doi.org/10.1145/3579607

Published: 16 April 2023

Abstract

Managing error has become an increasingly central and contested arena within data science work. While recent scholarship in artificial intelligence and machine learning has focused on limiting and eliminating error, practitioners have long used error as a site of collaboration and learning vis-à-vis labelers, domain experts, and the worlds data scientists seek to model and understand. Drawing from work in CSCW, STS, HCML, and repair studies, as well as from multi-sited ethnographic fieldwork within a government institution and a non-profit organization, we move beyond the notion of error as an edge case or anomaly to make three basic arguments. First, error discloses or calls to attention existing structures of collaboration unseen or underappreciated under 'working' systems. Second, error calls into being new forms and sites of collaboration (including, sometimes, new actors). Third, error redeploys old sites and actors in new ways, often through restructuring relations of hierarchy and expertise which recenter or devalue the position of different actors. We conclude by discussing how an artful living with error can better support the creative strategies of negotiation and adjustment which data scientists and their collaborators engage in when faced with disruption, breakdown, and friction in their work.

Index Terms

1. Human-centered computing
  1. Collaborative and social computing
    1. Collaborative and social computing design and evaluation methods
      1. Ethnographic studies

Published in
Proceedings of the ACM on Human-Computer Interaction, Volume 7, Issue CSCW1 (CSCW), April 2023, 3836 pages
EISSN: 2573-0142
DOI: 10.1145/3593053
Editor: Jeff Nichols, Google
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery, New York, NY, United States

Author Tags
AI ethics
critical data studies
data science
error
machine learning
repair