ABSTRACT
Artificial Intelligence (AI) plays an increasingly integral role in determining our day-to-day experiences. Its applications are no longer limited to search and recommendation systems, such as web search and movie and product recommendations; AI is also being used in decisions and processes that are critical for individuals, businesses, and society. With AI-based solutions in high-stakes domains such as hiring, lending, criminal justice, healthcare, and education, the personal and professional implications of AI are far-reaching. Consequently, it becomes critical to ensure that these models make accurate predictions, are robust to shifts in the data, do not rely on spurious features, and do not unduly discriminate against minority groups. To this end, several approaches spanning areas such as explainability, fairness, and robustness have been proposed in recent literature, and many papers and tutorials on these topics have been presented at recent computer science conferences. However, there has been relatively little attention to the need for monitoring machine learning (ML) models once they are deployed, and to the associated research challenges.
In this tutorial, we first motivate the need for ML model monitoring [14], as part of a broader AI model governance [9] and responsible AI framework, from societal, legal, customer/end-user, and model developer perspectives, and provide a roadmap for thinking about model monitoring in practice. We then present findings and insights on model monitoring desiderata based on interviews with ML practitioners spanning domains such as financial services, healthcare, hiring, online retail, computational advertising, and conversational assistants [15]. Next, we describe the technical considerations and challenges associated with realizing these desiderata in practice, and provide an overview of techniques and tools for model monitoring (e.g., [1, 2, 5, 6, 8, 10-13, 18-21]). We then focus on the real-world application of model monitoring methods and tools [3, 4, 7, 11, 13, 16, 17], presenting practical challenges and guidelines for using such techniques effectively, as well as lessons learned from deploying model monitoring tools for several web-scale AI/ML applications. We present case studies across different companies, spanning application domains such as financial services, healthcare, hiring, conversational assistants, online retail, computational advertising, search and recommendation systems, and fraud detection. We hope that our tutorial will inform both researchers and practitioners, stimulate further research on model monitoring, and pave the way for building more reliable ML models and monitoring tools in the future.
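To give a flavor of the drift-detection techniques surveyed above (e.g., [5, 19, 21]), the following is a minimal sketch, not taken from the tutorial or any of the cited tools: it flags per-feature covariate drift by comparing live traffic against a baseline captured at training time, using a two-sample Kolmogorov-Smirnov test. The function name, feature names, and p-value threshold are illustrative assumptions.

```python
# Minimal covariate-drift check (illustrative sketch, not a cited tool's API).
import numpy as np
from scipy.stats import ks_2samp

def detect_feature_drift(baseline: np.ndarray,
                         live: np.ndarray,
                         feature_names: list[str],
                         p_threshold: float = 0.01) -> dict[str, float]:
    """Return p-values for features whose live distribution differs
    significantly from the training-time baseline."""
    drifted = {}
    for i, name in enumerate(feature_names):
        _, p_value = ks_2samp(baseline[:, i], live[:, i])
        if p_value < p_threshold:  # reject "same distribution" hypothesis
            drifted[name] = p_value
    return drifted

# Example on synthetic data: the second feature has a mean shift.
rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, size=(5000, 2))
live = np.column_stack([rng.normal(0.0, 1.0, 5000),
                        rng.normal(0.5, 1.0, 5000)])
print(detect_feature_drift(baseline, live, ["income", "age"]))
```

Production monitoring systems such as those described in [11, 13] go further, e.g., maintaining baselines with streaming quantile sketches [2, 8] and alerting pipelines, but the underlying principle of comparing live statistics against a training-time reference is the same.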
REFERENCES
[1] Eric Breck, Neoklis Polyzotis, Sudip Roy, Steven Whang, and Martin Zinkevich. 2019. Data Validation for Machine Learning. In MLSys.
[2] Graham Cormode, Zohar Karnin, Edo Liberty, Justin Thaler, and Pavel Vesely. 2021. Relative Error Streaming Quantiles. In PODS.
[3] Jordan Edwards et al. 2021. MLOps: Model management, deployment, lineage, and monitoring with Azure Machine Learning. https://tinyurl.com/57y8rrec
[4] Fiddler. 2022. Explainable Monitoring. https://www.fiddler.ai/ml-monitoring
[5] João Gama, Albert Bifet, Mykola Pechenizkiy, and Abdelhamid Bouchachia. 2014. A survey on concept drift adaptation. ACM Computing Surveys (CSUR) 46, 4 (2014), 1--37.
[6] Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, and Zachary Lipton. 2020. A Unified View of Label Shift Estimation. In NeurIPS.
[7] Michaela Hardt, Xiaoguang Chen, Xiaoyi Cheng, Michele Donini, Jason Gelman, Satish Gollaprolu, John He, Pedro Larroy, Xinyu Liu, Nick McCarthy, Ashish Rathi, Scott Rees, Ankit Siva, ErhYuan Tsai, Keerthan Vasist, Pinar Yilmaz, Muhammad Bilal Zafar, Sanjiv Das, Kevin Haas, Tyler Hill, and Krishnaram Kenthapadi. 2021. Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud. In KDD.
[8] Zohar Karnin, Kevin Lang, and Edo Liberty. 2016. Optimal quantile approximation in streams. In FOCS.
[9] Eren Kurshan, Hongda Shen, and Jiahao Chen. 2020. Towards self-regulating AI: Challenges and opportunities of AI model governance in financial services. In Proceedings of the First ACM International Conference on AI in Finance.
[10] Zachary Lipton, Yu-Xiang Wang, and Alexander Smola. 2018. Detecting and correcting for label shift with black box predictors. In ICML.
[11] David Nigenda, Zohar Karnin, Muhammad Bilal Zafar, Raghu Ramesha, Alan Tan, Michele Donini, and Krishnaram Kenthapadi. 2022. Amazon SageMaker Model Monitor: A System for Real-Time Insights into Deployed Machine Learning Models. In KDD.
[12] Sashank Reddi, Barnabas Poczos, and Alex Smola. 2015. Doubly robust covariate shift correction. In AAAI.
[13] Sebastian Schelter, Dustin Lange, Philipp Schmidt, Meltem Celikel, Felix Biessmann, and Andreas Grafberger. 2018. Automating large-scale data quality verification. In VLDB.
[14] David Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. 2015. Hidden technical debt in machine learning systems. In NeurIPS.
[15] Murtuza N. Shergadwala, Himabindu Lakkaraju, and Krishnaram Kenthapadi. 2022. A Human-Centric Take on Model Monitoring. In ICML Workshop on Human-Machine Collaboration and Teaming (HMCaT).
[16] Ankur Taly, Kaz Sato, and Claudiu Gruia. 2021. Monitoring feature attributions: How Google saved one of the largest ML services in trouble. Google Cloud Blog. https://tinyurl.com/awt3f5ex
[17] TruEra. 2022. TruEra Monitoring. https://truera.com/monitoring/
[18] Alexey Tsymbal. 2004. The problem of concept drift: definitions and related work. Computer Science Department, Trinity College Dublin 106, 2 (2004), 58.
[19] Geoffrey I. Webb, Roy Hyde, Hong Cao, Hai Long Nguyen, and Francois Petitjean. 2016. Characterizing concept drift. Data Mining and Knowledge Discovery 30, 4 (2016), 964--994.
[20] Yifan Wu, Ezra Winston, Divyansh Kaushik, and Zachary Lipton. 2019. Domain adaptation with asymmetrically-relaxed distribution alignment. In ICML.
[21] Indrė Žliobaitė, Mykola Pechenizkiy, and Joao Gama. 2016. An overview of concept drift applications. In Big data analysis: new algorithms for a new society. 91--114.