Abstract
National Statistical Offices (NSOs) continue to adopt Machine Learning (ML) methods and tools across their operations, including data collection, integration, and processing. It remains unclear, however, how these complex, prediction-oriented approaches can be incorporated into the quality standards and frameworks of NSOs, or whether the frameworks themselves need to be modified. This article focuses on and builds upon two of the quality dimensions proposed in the Quality Framework for Statistical Algorithms (QF4SA): model explainability and accuracy (including uncertainty). We examine in detail the implications of current methods for explainable ML and uncertainty quantification, as well as their possible uses in statistical production, such as continuous model monitoring in intermediate ML classification and auto-coding phases. This strategy ensures that human subject-matter experts, an essential component of every statistical program, are effectively integrated into the life cycle of ML projects. It also helps maintain the quality of ML models in production, supports adherence to the current quality frameworks within NSOs, and ultimately builds confidence and trust in these emerging technologies.
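The uncertainty-quantification methods surveyed in the references below (notably conformal prediction, Vovk et al. 2005; Angelopoulos and Bates 2021) suggest one concrete way to combine auto-coding with human review: auto-code only records whose conformal prediction set contains a single candidate code, and route ambiguous records to subject-matter experts. The following is a minimal sketch under stated assumptions; the synthetic Dirichlet scores stand in for a real classifier's calibrated probabilities, and the coverage level `alpha` is an illustrative choice, not a value from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a classifier's class-probability scores on a held-out
# calibration set and on new records; in production these would come from
# the NSO's trained auto-coding model.
n_cal, n_new, n_classes = 500, 5, 3
cal_scores = rng.dirichlet(np.ones(n_classes), size=n_cal)
cal_labels = rng.integers(0, n_classes, size=n_cal)
new_scores = rng.dirichlet(np.ones(n_classes), size=n_new)

alpha = 0.1  # target miscoverage: sets miss the true code at most ~10% of the time

# Split conformal prediction: nonconformity score = 1 - probability assigned
# to the true class on the calibration set.
nonconf = 1.0 - cal_scores[np.arange(n_cal), cal_labels]
q_level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
qhat = np.quantile(nonconf, q_level, method="higher")

# Prediction set for each new record: all classes scoring at least 1 - qhat.
pred_sets = [np.where(s >= 1 - qhat)[0] for s in new_scores]

# Route records: auto-code singleton sets, send ambiguous ones to a human coder.
for i, ps in enumerate(pred_sets):
    action = "auto-code" if len(ps) == 1 else "manual review"
    print(f"record {i}: candidate codes {ps.tolist()} -> {action}")
```

The fraction of records routed to manual review can itself be monitored over time: a rising abstention rate is an early signal of model or data drift, which connects this routing rule back to continuous model monitoring.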
References
Alvarez-Melis D, Jaakkola TS (2018) On the robustness of interpretability methods (presented at 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018), Stockholm, Sweden)
Angelopoulos AN, Bates S (2021) A gentle introduction to conformal prediction and distribution-free uncertainty quantification (arXiv:2107.07511)
Angelopoulos AN, Bates S, Fisch A, Lei L, Schuster T (2022) Conformal risk control (arXiv:2208.02814)
Angelopoulos AN, Bates S, Fannjiang C, Jordan MI, Zrnic T (2023) Prediction-powered inference (arXiv:2301.09633)
Barber RF, Candès EJ, Ramdas A, Tibshirani RJ (2021) Predictive inference with the jackknife. Ann Stat 49(1):486–507. https://doi.org/10.1214/20-AOS1965
Barber RF, Candès EJ, Ramdas A, Tibshirani RJ (2022) Conformal prediction beyond exchangeability (arXiv:2202.13415)
Bernasconi E, De Fausti F, Pugliese F, Scannapieco M, Zardetto D (2022) Automatic extraction of land cover statistics from satellite imagery by deep learning. Stat J IAOS 38:183–199
Bhatt U, Antorán J, Zhang Y, Liao QV, Sattigeri P, Fogliato R, Melançon G, Krishnan R, Stanley J, Tickoo O, Nachman L, Chunara R, Srikumar M, Weller A, Xiang A (2021) Uncertainty as a form of transparency: measuring, communicating, and using uncertainty. Proceedings of the 2021 AAAI/ACM conference on AI, ethics, and society, association for computing machinery, New York, NY, USA, pp 401–413 https://doi.org/10.1145/3461702.3462571
Böhm V, Lanusse F, Seljak U (2019) Uncertainty quantification with generative models (arXiv:1910.10046)
Breidt FJ, Claeskens G, Opsomer JD (2005) Model-assisted estimation for complex surveys using penalised splines. Biometrika 92(4):831–846
Cassel CM, Särndal CE, Wretman JH (1976) Some results on generalized difference estimation and generalized regression estimation for finite populations. Biometrika 63:615–620
Chambers R, Clark R (2012) An introduction to model-based survey sampling with applications. Oxford University Press https://doi.org/10.1093/acprof:oso/9780198566625.001.0001
Chen T, Fox E, Guestrin C (2014) Stochastic gradient Hamiltonian Monte Carlo. In: Xing EP, Jebara T (eds) Proceedings of the 31st international conference on machine learning, PMLR, Beijing, China, proceedings of machine learning research, vol 32, pp 1683–1691
Daas P, Puts M, Buelens B, van den Hurk P (2015) Big data as a source for official statistics. J Off Stat 31(2):249–262. https://doi.org/10.1515/jos-2015-0016
Dagdoug M, Goga C, Haziza D (2021) Model-assisted estimation through random forests in finite population sampling. J Am Stat Assoc. https://doi.org/10.1080/01621459.2021.1987250
Earth observations for official statistics (2017) United Nations satellite imagery and geospatial data task team report. https://unstats.un.org/bigdata/task-teams/earth-observation/UNGWG_Satellite_Task_Team_Report_WhiteCover.pdf. Accessed August 16, 2023
Erman S, Rancourt E, Beaucage Y, Loranger A (2022) The use of data science in a national statistical office. https://hdsr.mitpress.mit.edu/pub/x0l4x099. Accessed August 16, 2023
European Commission (2022) EU AI act. https://artificialintelligenceact.eu/the-act/. Accessed August 16, 2023
Fadel S, Trottier S (2023) A study on explainable active learning for text classification (Statistics Canada’s internal report)
Firth D, Bennett KE (1998) Robust models in probability sampling. J Royal Stat Soc Ser B 60(1):3–21. https://doi.org/10.1111/1467-9868.00105
Gal Y, Ghahramani Z (2015) Bayesian convolutional neural networks with Bernoulli approximate variational inference (arXiv:1506.02158)
Gal Y, Ghahramani Z (2016) Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In: Balcan MF, Weinberger KQ (eds) Proceedings of the 33rd international conference on machine learning, PMLR, New York, New York, USA, proceedings of machine learning research, vol 48, pp 1050–1059
Gal Y, Islam R, Ghahramani Z (2017) Deep Bayesian active learning with image data. Proceedings of the 34th international conference on machine learning, vol 70, pp 1183–1192
Geifman Y, El-Yaniv R (2017) Selective classification for deep neural networks. In: Guyon I, von Luxburg U, Bengio S, Wallach H, Fergus R, Garnett R (eds) Advances in neural information processing systems, vol 30
Gelein B, Haziza D, Causeur D (2018) Propensity weighting for survey non-response through machine learning. Journées De Méthodologie Stat. https://doi.org/10.1016/j.jmva.2014.06.020
Ghai B, Liao QV, Zhang Y, Bellamy R, Mueller K (2021) Explainable active learning (xal): toward ai explanations as interfaces for machine teachers. Proc ACM Hum-Comput Interact. https://doi.org/10.1145/3432934
Government of Canada (2022a) National occupational classification (NOC) Canada 2021 version 1.0. https://www.statcan.gc.ca/en/subjects/standard/noc/2021/indexV1. Accessed August 16, 2023
Government of Canada (2022b) North American industry classification system (NAICS) Canada 2022 version 1.0. https://www.statcan.gc.ca/en/subjects/standard/naics/2022/v1/index. Accessed August 16, 2023
Government of Canada (2023) North American product classification system (NAPCS) Canada 2022 version 1.0. https://www.statcan.gc.ca/en/subjects/standard/napcs/2022/index. Accessed August 16, 2023
Government of Canada. AI and Data Act (Bill C-27). https://www.parl.ca/DocumentViewer/en/44-1/bill/C-27/first-reading. Accessed August 16, 2023
Guo C, Pleiss G, Sun Y, Weinberger KQ (2017) On calibration of modern neural networks. In: Precup D, Teh YW (eds) Proceedings of the 34th international conference on machine learning, PMLR, proceedings of machine learning research, vol 70, pp 1321–1330
Haziza D, Beaumont JF (2017) Construction of weights in surveys: a review. Stat Sci 32:206–226
Hüllermeier E, Waegeman W (2021) Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods. Mach Learn 110(3):457–506. https://doi.org/10.1007/s10994-021-05946-3
Kaiser P, Kern C, Rügamer D (2022) Uncertainty-aware predictive modeling for fair data-driven decisions
Kull M, Silva Filho TM, Flach P (2017) Beyond sigmoids: how to obtain well-calibrated probabilities from binary classifiers with beta calibration. Electron J Statist 11(2):5052–5080. https://doi.org/10.1214/17-EJS1338SI
Lakshminarayanan B, Pritzel A, Blundell C (2017) Simple and scalable predictive uncertainty estimation using deep ensembles. Proceedings of the 31st international conference on neural information processing systems, pp 6405–6416
Lele SR (2020) How should we quantify uncertainty in statistical inference? Front Ecol Evol. https://doi.org/10.3389/fevo.2020.00035
Lundberg SM, Lee SI (2017) A unified approach to interpreting model predictions. Proceedings of the 31st international conference on neural information processing systems, pp 4768–4777
McConville KS, Moisen GG, Frescino TS (2020) A tutorial on model-assisted estimation with application to forest inventory. Forests. https://doi.org/10.3390/f11020244
Montanari G, Ranalli M (2005) Nonparametric model calibration estimation in survey sampling. J Am Stat Assoc 100(472):1429–1442
Mothilal RK, Sharma A, Tan C (2020) Explaining machine learning classifiers through diverse counterfactual explanations. Proceedings of the 2020 conference on fairness, accountability, and transparency, association for computing machinery, pp 607–617 https://doi.org/10.1145/3351095.3372850
Murdoch WJ, Singh C, Kumbier K, Abbasi-Asl R, Yu B (2019) Definitions, methods, and applications in interpretable machine learning. Proc Natl Acad Sci USA. https://doi.org/10.1073/pnas.1900654116
Platt JC (1999) Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in large margin classifiers. MIT Press, pp 61–74
Ribeiro MT, Singh S, Guestrin C (2018) Anchors: high-precision model-agnostic explanations. Proceedings of the AAAI conference on artificial intelligence
Romano Y, Patterson E, Candès EJ (2019) Conformalized quantile regression. In: Wallach H, Larochelle H, Beygelzimer A, d'Alché-Buc F, Fox E, Garnett R (eds) Advances in neural information processing systems, vol 32
Roscher R, Bohn B, Duarte MF, Garcke J (2020) Explainable machine learning for scientific insights and discoveries. IEEE Access. https://doi.org/10.1109/ACCESS.2020.2976199
Rudin C (2019) Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell 1(5):206–215. https://doi.org/10.1038/s42256-019-0048-x
Särndal CE, Swensson B, Wretman J (1992) Model assisted survey sampling. Springer
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958
Statistics Canada’s quality guidelines (2019) https://www150.statcan.gc.ca/n1/pub/12-539-x/12-539-x2019001-eng.htm. Accessed August 16, 2023
Steinberger L, Leeb H (2018) Conditional predictive inference for stable algorithms (arXiv:1809.01412)
OECD (2019) The OECD Artificial Intelligence (AI) Principles. https://oecd.ai/en/ai-principles. Accessed August 16, 2023
Vaicenavicius J, Widmann D, Andersson C, Lindsten F, Roll J, Schön T (2019) Evaluating model calibration in classification. In: Chaudhuri K, Sugiyama M (eds) Proceedings of the twenty-second international conference on artificial intelligence and statistics, PMLR, proceedings of machine learning research, vol 89, pp 3459–3467
Vovk V, Gammerman A, Shafer G (2005) Algorithmic learning in a random world. Springer, Berlin, Heidelberg
Wachter S, Mittelstadt B, Russell C (2018) Counterfactual explanations without opening the black box: automated decisions and the GDPR. Harv J Law Technol 31(2):841–887
Yung W, Cook K, Thomas S (2004) Use of GST data by the monthly survey of manufacturing. https://www.oecd.org/sdd/36232466.pdf. Accessed August 16, 2023
Yung W, Tam SM, Buelens B, Chipman H, Dumpert F, Ascari G, Rocci F, Burger J, Choi IK (2022) A quality framework for statistical algorithms. Stat J IAOS. https://doi.org/10.3233/SJI-210875
Zadrozny B, Elkan C (2001) Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. Proceedings of the eighteenth international conference on machine learning. Morgan Kaufmann Publishers Inc, San Francisco, CA, USA, pp 609–616
Zhang J (2022) Machine learning techniques to handle survey non-response (Statistics Canada's internal report)
Ethics declarations
Conflict of interest
The content of this article represents the position of the authors and may not necessarily represent that of Statistics Canada. The authors declare no competing or conflicting interests that could be perceived as having influenced the work presented in this paper.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Molladavoudi, S., Yung, W. Exploring quality dimensions in trustworthy Machine Learning in the context of official statistics: model explainability and uncertainty quantification. AStA Wirtsch Sozialstat Arch 17, 223–252 (2023). https://doi.org/10.1007/s11943-023-00331-z