DOI: 10.1145/3468264.3468547

XAI tools in the public sector: a case study on predicting combined sewer overflows

Published: 18 August 2021

ABSTRACT

Artificial intelligence and deep learning are becoming increasingly prevalent in contemporary software solutions. Explainable artificial intelligence (XAI) tools attempt to address the black-box nature of deep learning models and make them more understandable to humans. In this work, we apply three state-of-the-art XAI tools in a real-world case study. Our study focuses on predicting combined sewer overflow events for a municipal wastewater treatment organization. Through a data-driven inquiry, we collect both qualitative information via stakeholder interviews and quantitative measures. These help us assess the predictive accuracy of the XAI tools, as well as the simplicity, soundness, and insightfulness of the explanations they produce. Our results not only show the varying degrees to which the XAI tools meet the requirements, but also highlight that domain experts can draw new insights from complex explanations that may differ from their previous expectations.

      • Published in

        ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
        August 2021
        1690 pages
        ISBN:9781450385626
        DOI:10.1145/3468264

        Copyright © 2021 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate: 112 of 543 submissions, 21%
