
The Limitations of Data, Machine Learning and Us

Published: 09 June 2024

Abstract

Machine learning (ML), particularly deep learning, is being used everywhere. However, it is not always applied well, and its use often raises ethical and/or scientific issues. In this keynote we first take a deep dive into the limitations of supervised ML and of data, its key input. We cover small data, datification, bias, and evaluating success instead of harm, among other limitations. The second part is about ourselves using ML, including different types of social limitations and human incompetence, such as cognitive biases, pseudoscience, and unethical applications. These limitations have harmful consequences, including discrimination, misinformation, and mental health issues, to mention just a few. In the final part we discuss regulation of the use of AI and responsible AI principles that can mitigate the problems outlined above.



Published In

SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data
June 2024
694 pages
ISBN:9798400704222
DOI:10.1145/3626246
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 June 2024


Author Tags

  1. ai ethics
  2. bias
  3. legitimacy
  4. ml evaluation
  5. pseudoscience

Qualifiers

  • Keynote

Funding Sources

  • National Science Foundation

Conference

SIGMOD/PODS '24

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

