Data-Debugging Through Interactive Visual Explanations

Afzal, Shazia; Chaudhary, Arunima; Gupta, Nitin; Patel, Hima; Spina, Carolina; Wang, Dakuo

doi:10.1007/978-3-030-75015-2_14

Conference paper
First Online: 03 May 2021

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12705)

Abstract

Data readiness analysis comprises methods that profile data and flag quality issues to determine the AI readiness of a given dataset. Such methods are increasingly used to understand, inspect, and correct anomalies in data so that their impact on downstream machine learning is limited. This often requires a human in the loop to validate the findings and apply remedial actions. In this paper, we describe a tool that assists data workers in this task by providing rich explanations of the results obtained through data readiness analysis. The aim is to allow interactive visual inspection and debugging of data issues, to enhance interpretability, and to facilitate informed remediation by humans in the loop.
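
To make the idea of "profiling data and flagging quality issues" concrete, the following is a minimal sketch of what such a check might look like. It is not the tool described in the paper: the function names (`profile`, `flag_issues`), the two checks (missing values and label imbalance), and the thresholds are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import Any, Dict, List, Optional

Row = Dict[str, Optional[Any]]


def profile(rows: List[Row]) -> Dict[str, Dict[str, float]]:
    """Compute, per column, the fraction of missing values and the distinct count."""
    columns = sorted({key for row in rows for key in row})
    report: Dict[str, Dict[str, float]] = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        report[col] = {
            "missing_fraction": (len(values) - len(present)) / len(rows) if rows else 0.0,
            "distinct": float(len(set(present))),
        }
    return report


def flag_issues(
    rows: List[Row],
    label_col: str,
    max_missing: float = 0.2,    # hypothetical threshold, not from the paper
    max_imbalance: float = 0.7,  # hypothetical threshold, not from the paper
) -> List[str]:
    """Return human-readable descriptions of the issues found in the data."""
    issues: List[str] = []
    # Check 1: columns with too many missing values.
    for col, stats in profile(rows).items():
        if stats["missing_fraction"] > max_missing:
            issues.append(f"{col}: {stats['missing_fraction']:.0%} missing")
    # Check 2: a single class dominating the label column.
    labels = Counter(
        row.get(label_col) for row in rows if row.get(label_col) is not None
    )
    if labels:
        top_label, top_count = labels.most_common(1)[0]
        share = top_count / sum(labels.values())
        if share > max_imbalance:
            issues.append(f"{label_col}: class '{top_label}' covers {share:.0%} of labels")
    return issues


# Illustrative data only; a person reviewing the flags decides what to fix.
sample_rows = [
    {"age": 34, "income": None, "label": "yes"},
    {"age": 51, "income": 42000, "label": "yes"},
    {"age": None, "income": None, "label": "yes"},
    {"age": 29, "income": 38000, "label": "yes"},
    {"age": 45, "income": None, "label": "no"},
]
for issue in flag_issues(sample_rows, label_col="label"):
    print(issue)
```

The sketch returns each flag as a short textual message so that a person can inspect it before acting, which is the role the abstract assigns to the human in the loop; an interactive tool would presumably pair such messages with visualizations.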

The first two authors have contributed equally to this paper.

Author information

Authors and Affiliations

IBM Research, New Delhi, India
Shazia Afzal, Arunima Chaudhary, Nitin Gupta & Hima Patel
IBM Argentina, Buenos Aires, Argentina
Carolina Spina
IBM Research, Cambridge, USA
Arunima Chaudhary & Dakuo Wang

Corresponding author

Correspondence to Shazia Afzal.

Editor information

Editors and Affiliations

Microsoft, Hyderabad, India
Manish Gupta
Indian Institute of Technology Bombay, Mumbai, India
Ganesh Ramakrishnan

About this paper

Cite this paper

Afzal, S., Chaudhary, A., Gupta, N., Patel, H., Spina, C., Wang, D. (2021). Data-Debugging Through Interactive Visual Explanations. In: Gupta, M., Ramakrishnan, G. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12705. Springer, Cham. https://doi.org/10.1007/978-3-030-75015-2_14

DOI: https://doi.org/10.1007/978-3-030-75015-2_14
Published: 03 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75014-5
Online ISBN: 978-3-030-75015-2
eBook Packages: Computer Science, Computer Science (R0)
