XAI tools in the public sector: a case study on predicting combined sewer overflows

ABSTRACT
Artificial intelligence and deep learning are becoming increasingly prevalent in contemporary software solutions. Explainable artificial intelligence (XAI) tools attempt to address the black-box nature of deep learning models and make them more understandable to humans. In this work, we apply three state-of-the-art XAI tools in a real-world case study of predicting combined sewer overflow events for a municipal wastewater treatment organization. Through a data-driven inquiry, we collect both qualitative information via stakeholder interviews and quantitative measures, which help us assess the predictive accuracy of the XAI tools as well as the simplicity, soundness, and insightfulness of the explanations they produce. Our results not only show the varying degrees to which the XAI tools meet these requirements, but also highlight that domain experts can draw new insights from complex explanations that may differ from their prior expectations.